On Discovering Data Preparation Modules Using Examples
Belhajjame, Khalid (2020), On Discovering Data Preparation Modules Using Examples, in Kafeza, Eleanna; Benatallah, Boualem; Martinelli, Fabio, Service-Oriented Computing (Proceedings), Springer : Berlin Heidelberg, p. 56-65. 10.1007/978-3-030-65310-1_5
Type: Communication / Conference
Conference title: 18th International Conference, ICSOC 2020
Conference country: United Arab Emirates
Book title: Service-Oriented Computing (Proceedings)
Book author: Kafeza, Eleanna; Benatallah, Boualem; Martinelli, Fabio
Number of pages: 599
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN): A major issue that arises when designing data-analysis pipelines is that of identifying the services (referred to as modules in this paper) that are suitable for performing data preparation steps. Such modules represent 80% of the modules that compose data analysis workflows. They are ubiquitous and are used to perform, amongst other things, operations such as record retrieval, format transformation, and data combination. To assist scientists in the task of discovering suitable modules, we examine, in this paper, a solution that uses two ingredients for the discovery of data preparation modules: semantic annotations describing the inputs and outputs of modules, and data examples that characterize the modules' behavior. The discovery strategy that we devised is iterative in that it allows scientists to explore existing modules by providing feedback on data examples.
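The abstract gives no algorithmic detail, so the following is only a minimal sketch of how annotation-based matching followed by feedback on data examples could look. It is not the paper's method. All names (`Module`, `DataExample`, `match_annotations`, `refine_by_feedback`), the exact-match rule on concepts, and the registry contents are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class DataExample:
    """One input/output pair illustrating a module's behavior (hypothetical)."""
    input_value: str
    output_value: str


@dataclass
class Module:
    """A module with semantic annotations on its input and output (hypothetical)."""
    name: str
    input_concept: str
    output_concept: str
    examples: List[DataExample] = field(default_factory=list)


def match_annotations(modules, input_concept, output_concept):
    """Step 1 (assumed): keep modules whose annotations equal the requested concepts."""
    return [
        m for m in modules
        if m.input_concept == input_concept and m.output_concept == output_concept
    ]


def refine_by_feedback(candidates, accepted: Set[DataExample], rejected: Set[DataExample]):
    """Step 2 (assumed): keep modules that exhibit every accepted example
    and none of the rejected ones."""
    kept = []
    for m in candidates:
        shown = set(m.examples)
        if accepted <= shown and not (rejected & shown):
            kept.append(m)
    return kept


# Hypothetical registry of three modules.
registry = [
    Module("fetch_record", "AccessionID", "ProteinRecord",
           [DataExample("P12345", "record:P12345")]),
    Module("fetch_sequence", "AccessionID", "ProteinSequence",
           [DataExample("P12345", "MKTAYIAK")]),
    Module("fetch_record_summary", "AccessionID", "ProteinRecord",
           [DataExample("P12345", "summary:P12345")]),
]

candidates = match_annotations(registry, "AccessionID", "ProteinRecord")

# One feedback round: the scientist rejects the summary-style example.
remaining = refine_by_feedback(
    candidates,
    accepted=set(),
    rejected={DataExample("P12345", "summary:P12345")},
)
```

In this sketch each feedback round can only shrink the candidate set, which is one simple way to read "iterative" exploration. A real system would likely use an ontology for concept matching rather than string equality.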
Subjects / Keywords: data-analysis