Genetic Programming over Spark for Higgs Boson Classification

Hmida, Hmida; Ben Hamida, Sana; Borgi, Amel; Rukoz, Marta (2019), Genetic Programming over Spark for Higgs Boson Classification, in Abramowicz, Witold; Corchuelo, Rafael, Business Information Systems, Conference proceedings, Springer, p. 300-312. 10.1007/978-3-030-20485-3_23

Type
Conference paper
Date
2019
Conference title
22nd International Conference, BIS 2019
Conference date
2019-06
Conference city
Seville
Conference country
Spain
Book title
Business Information Systems, Conference proceedings
Book author
Abramowicz, Witold; Corchuelo, Rafael
Publisher
Springer
ISBN
978-3-030-20485-3
Number of pages
541
Pages
300-312
Publication identifier
10.1007/978-3-030-20485-3_23
Author(s)
Hmida, Hmida
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Ben Hamida, Sana
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Borgi, Amel
Laboratoire d'Informatique, Programmation, Algorithmique et Heuristique [LIPAH]
Rukoz, Marta
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN)
With the growing number of available databases containing very large numbers of records, existing knowledge discovery tools need to be adapted to this shift and new tools need to be created. Genetic Programming (GP) has proven to be an efficient algorithm, in particular for classification problems. However, GP is hampered by its computing cost, which becomes more acute with large datasets. This paper presents how an existing GP implementation (DEAP) can be adapted by distributing evaluations on a Spark cluster. An additional sampling step is then applied to fit tiny clusters. Experiments are carried out on Higgs Boson classification with different settings. They show the benefits of using Spark as a parallelization technology for GP.
Subjects / Keywords
Genetic Programming; Machine learning; Spark; Large dataset; Higgs Boson classification

Related items

Showing items related by title and author.

  • Scale Genetic Programming for large Data Sets: Case of Higgs Bosons Classification
    Hmida, Hmida; Ben Hamida, Sana; Borgi, Amel; Rukoz, Marta (2018) Article accepted for publication or published
  • A new adaptive sampling approach for Genetic Programming
    Hmida, Hmida; Ben Hamida, Sana; Borgi, Amel; Rukoz, Marta (2019) Conference paper
  • Adaptive sampling for active learning with genetic programming
    Ben Hamida, Sana; Hmida, Hmida; Borgi, Amel; Rukoz, Marta (2019) Article accepted for publication or published
  • Sampling Methods in Genetic Programming Learners from Large Datasets: A Comparative Study
    Hmida, Hmida; Ben Hamida, Sana; Borgi, Amel; Rukoz, Marta (2017) Conference paper
  • Hierarchical Data Topology Based Selection for Large Scale Learning
    Hmida, Hmida; Ben Hamida, Sana; Borgi, Amel; Rukoz, Marta (2016) Conference paper