Genetic Programming over Spark for Higgs Boson Classification

File
Genetic_Programming_over_Spark_for_Higgs_Boson_Classification.pdf (1.085Mb)
Date
2019
Dewey
Programming, software, data organization
Subject
Genetic Programming; Machine learning; Spark; Large dataset; Higgs Boson classification
DOI
http://dx.doi.org/10.1007/978-3-030-20485-3_23
Conference name
22nd International Conference, BIS 2019
Conference date
06-2019
Conference city
Seville
Conference country
Spain
Book title
Business Information Systems, Conference proceedings
Book editors
Abramowicz, Witold; Corchuelo, Rafael
Publisher
Springer
Year
2019
Number of pages
541
ISBN
978-3-030-20485-3
Book URL
10.1007/978-3-030-20485-3
URI
https://basepub.dauphine.fr/handle/123456789/19994
Collections
  • LAMSADE : Publications
Author
Hmida, Hmida
989 Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Ben Hamida, Sana
989 Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Borgi, Amel
253759 Laboratoire d'Informatique, Programmation, Algorithmique et Heuristique [LIPAH]
Rukoz, Marta
989 Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Type
Conference paper
Item pages
300-312
Abstract (EN)
With the growing number of available databases containing very large numbers of records, existing knowledge discovery tools need to be adapted and new tools need to be created. Genetic Programming (GP) has proven to be an efficient algorithm, in particular for classification problems. However, GP is hampered by its computing cost, which becomes more acute with large datasets. This paper presents how an existing GP implementation (DEAP) can be adapted by distributing evaluations on a Spark cluster. An additional sampling step is then applied to fit tiny clusters. Experiments are carried out on Higgs Boson classification with different settings. They show the benefits of using Spark as a parallelization technology for GP.
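
The idea summarized in the abstract, evaluating each GP individual over a partitioned dataset and optionally sampling the data first, can be illustrated with a minimal sketch. This is not the paper's implementation and uses neither DEAP nor Spark: plain Python lists stand in for Spark partitions, the built-in `map` and `functools.reduce` stand in for a cluster-side map/reduce, and the synthetic dataset, the helper names (`make_dataset`, `partition`, `sample`, `count_correct`, `evaluate`) and the lambda "individuals" are all hypothetical.

```python
import random
from functools import reduce


def make_dataset(n, seed=0):
    """Build a synthetic binary classification set (assumption: not Higgs data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append(((x1, x2), 1 if x1 + x2 > 0 else 0))
    return rows


def partition(rows, k):
    """Split rows into k chunks; these lists stand in for cluster partitions."""
    return [rows[i::k] for i in range(k)]


def sample(rows, fraction, seed=0):
    """Keep each row with probability `fraction` (the extra sampling step)."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < fraction]


def count_correct(individual, part):
    """Map step: count correct predictions of one individual on one partition."""
    hits = sum(1 for x, y in part if (individual(*x) > 0) == (y == 1))
    return hits, len(part)


def evaluate(individual, partitions):
    """Reduce step: merge per-partition counts into one accuracy value."""
    partials = map(lambda part: count_correct(individual, part), partitions)
    hits, total = reduce(
        lambda a, b: (a[0] + b[0], a[1] + b[1]), partials, (0, 0)
    )
    return hits / total if total else 0.0


if __name__ == "__main__":
    data = make_dataset(10_000)
    parts = partition(sample(data, fraction=0.1), k=4)
    # Hypothetical GP individuals, written by hand instead of evolved.
    population = {
        "x1 + x2": lambda x1, x2: x1 + x2,
        "x1 - x2": lambda x1, x2: x1 - x2,
        "x1 * x2": lambda x1, x2: x1 * x2,
    }
    for name, individual in population.items():
        print(name, round(evaluate(individual, parts), 3))
```

On a real cluster the per-partition counting would run on the workers and only the small `(hits, total)` pairs would travel back to the driver, which is what makes this pattern attractive when the dataset is much larger than the population.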


 Content on this site is licensed under a Creative Commons 2.0 France (CC BY-NC-ND 2.0) license.