On Discovering Data Preparation Modules Using Examples

Belhajjame, Khalid (2020), On Discovering Data Preparation Modules Using Examples, in Kafeza, Eleanna; Benatallah, Boualem; Martinelli, Fabio, Service-Oriented Computing (Proceedings), Springer: Berlin Heidelberg, pp. 56-65. DOI: 10.1007/978-3-030-65310-1_5

View/Open
DiscoveringDataPreparation.pdf (897.8Kb)
Type
Conference paper
Date
2020
Conference title
18th International Conference, ICSOC 2020
Conference date
2020-12
Conference city
Dubai
Conference country
United Arab Emirates
Book title
Service-Oriented Computing (Proceedings)
Book author
Kafeza, Eleanna; Benatallah, Boualem; Martinelli, Fabio
Publisher
Springer
Published in
Berlin Heidelberg
ISBN
978-3-030-65310-1
Number of pages
599
Pages
56-65
Publication identifier
10.1007/978-3-030-65310-1_5
Author(s)
Belhajjame, Khalid
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN)
A major issue that arises when designing data-analysis pipelines is identifying the services (referred to as modules in this paper) that are suitable for performing data preparation steps, which represent 80% of the modules that compose data analysis workflows. Such modules are ubiquitous and are used to perform operations such as record retrieval, format transformation, and data combination, to name a few. To assist scientists in discovering suitable modules, we examine, in this paper, a solution that uses semantic annotations describing the inputs and outputs of modules, together with data examples that characterize the modules’ behavior, as ingredients for the discovery of data preparation modules. The discovery strategy that we devised is iterative in that it allows scientists to explore existing modules by providing feedback on data examples.
Subjects / Keywords
data-analysis
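
Illustrative sketch (not taken from the paper): the abstract describes discovery as a combination of semantic annotations on module inputs and outputs, data examples that characterize behavior, and iterative user feedback on those examples. The Python below is one minimal way such a loop could look. All names (`Module`, `DataExample`, `discover`, `judge`) and the toy concepts and values are hypothetical, and exact string equality on concepts stands in for whatever ontology-based matching the actual approach uses.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DataExample:
    """One input/output pair that characterizes a module's behavior."""
    input_value: str
    output_value: str


@dataclass
class Module:
    """A module annotated with (hypothetical) input and output concepts."""
    name: str
    input_concept: str
    output_concept: str
    examples: List[DataExample] = field(default_factory=list)


def match_annotations(modules, input_concept, output_concept):
    """Keep modules whose annotations equal the requested concepts.

    Assumption: plain equality replaces any subsumption reasoning.
    """
    return [m for m in modules
            if m.input_concept == input_concept
            and m.output_concept == output_concept]


def discover(modules, input_concept, output_concept,
             judge: Callable[[DataExample], bool]):
    """Narrow candidates by asking for feedback on one example at a time."""
    candidates = match_annotations(modules, input_concept, output_concept)
    judged: Dict[DataExample, bool] = {}
    while True:
        # Examples of the remaining candidates that have not been shown yet.
        pending = [e for m in candidates for e in m.examples
                   if e not in judged]
        if not pending:
            return candidates
        example = pending[0]
        judged[example] = judge(example)
        # Drop every candidate that owns an example the user rejected.
        candidates = [m for m in candidates
                      if all(judged.get(e, True) for e in m.examples)]


# Toy catalogue with made-up modules and values.
catalogue = [
    Module("upper", "Text", "Text", [DataExample("abc", "ABC")]),
    Module("reverse", "Text", "Text", [DataExample("abc", "cba")]),
    Module("length", "Text", "Number", [DataExample("abc", "3")]),
]

# Simulated feedback: the user wants upper-casing behavior.
result = discover(catalogue, "Text", "Text",
                  lambda e: e.output_value == e.input_value.upper())
print([m.name for m in result])
```

In this sketch the annotation filter prunes the catalogue before any example is shown, so feedback is only requested on modules that are already plausible matches.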

Related items

Showing items related by title and author.

  • Annotating the Behavior of Scientific Modules Using Data Examples: A Practical Approach
    Belhajjame, Khalid (2014) Conference paper
  • Quality Based Data Integration for Enriching User Data Sources in Service Lakes
    Alili, Hiba; Belhajjame, Khalid; Drira, Rim; Grigori, Daniela; Ben Ghezala, Henda Hajjami (2018) Conference paper
  • Verification of Semantic Web Service Annotations Using Ontology-Based Partitioning
    Belhajjame, Khalid; Embury, Suzanne; Paton, Norman W. (2014) Article accepted for publication or published
  • LabelFlow: Exploiting Workflow Provenance to Surface Scientific Data Provenance
    Alper, Pinar; Belhajjame, Khalid; Goble, Carole; Karagoz, Pinar (2015) Conference paper
  • On Enriching User-Centered Data Integration Schemas in Service Lakes
    Alili, Hiba; Belhajjame, Khalid; Grigori, Daniela; Drira, Rim; Ben Ghezala, Henda Hajjami (2017) Conference paper