Static analysis of Taverna workflows to predict provenance patterns

Alper, Pinar; Belhajjame, Khalid; Goble, Carole (2017), Static analysis of Taverna workflows to predict provenance patterns, Future Generation Computer Systems, 75, pp. 310-329. doi:10.1016/j.future.2017.01.004

Type
Article accepted for publication or published
External document link
https://research.manchester.ac.uk/en/publications/static-analysis-of-taverna-workflows-to-predict-provenance-patter
Date
2017
Journal name
Future Generation Computer Systems
Volume
75
Publisher
Elsevier
Pages
310-329
Publication identifier
10.1016/j.future.2017.01.004
Author(s)
Alper, Pinar
School of Computer Science [Manchester]
Belhajjame, Khalid
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Goble, Carole
School of Computer Science [Manchester]
Abstract (EN)
Workflows have found adoption in scientific domains, particularly due to their automation and provenance features. Using workflows, scientists can repeat analyses with different input parameters and later use provenance to access and compare results based on these respective parameters. A common assumption is that by designing an analysis as a workflow we get parameter-to-result traceability for free by using workflow provenance. This assumption holds for cases of coarse-grained traceability, where an entire workflow is subjected to repetition and all workflow parameters contribute to all results. However, it is not guaranteed to hold for cases requiring finer-grained traceability, where a workflow is configured with collections of parameters and analyses within the workflow are repeated with combinations of parameters from those collections. In this paper we identify two dimensions that affect fine-grained traceability: (1) Factorial Design, which is the level of granularity in modelling parameters/data in workflows and in provenance that is supported by a workflow system; and (2) the practice of scientists in successfully encoding Factorial Design into workflows. Taverna is a workflow system that provides extensive features for Factorial Design. However, it also supports a free approach to workflow design, which means that scientists may create workflows that break traceability in provenance when they run. Using a real-world Taverna workflow, we show how broken traceability manifests in provenance, rendering it ineffective for accessing workflow outputs derived from particular input parameters. To prevent broken traceability from occurring, we describe a rule-based static analysis technique which operates over workflow descriptions and anticipates patterns in provenance. Our rules exploit the well-defined execution behaviour of the Taverna system.
To understand Factorial Design support in workflow systems in general, we provide a comparative survey. We conclude that other workflow systems also provide constructs for Factorial Design and that, like Taverna, they too are prone to broken traceability.
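As a rough illustration of the fine-grained traceability the abstract refers to, the sketch below repeats one analysis step over every combination of two parameter collections and keys each result by the positions of its inputs. It is plain Python, not Taverna's API or the paper's rule set; the names `run_factorial` and `inputs_for` and the sample data are hypothetical.

```python
from itertools import product


def run_factorial(thresholds, datasets, step):
    """Run `step` once per (threshold, dataset) combination.

    Each result is keyed by the positions of its inputs in their
    collections, so the parameters behind a result can be looked up later.
    """
    indexed = {}
    for (i, t), (j, d) in product(enumerate(thresholds), enumerate(datasets)):
        indexed[(i, j)] = step(t, d)
    return indexed


def inputs_for(index, thresholds, datasets):
    """Recover the input parameters that produced the result at `index`."""
    i, j = index
    return thresholds[i], datasets[j]


# Hypothetical sample data: count the values in each dataset above a threshold.
thresholds = [0.1, 0.5]
datasets = [[0.2, 0.7], [0.05, 0.3, 0.9]]
indexed = run_factorial(thresholds, datasets,
                        lambda t, d: sum(x > t for x in d))

# Dropping the index structure keeps the values but loses the link to inputs.
flat = list(indexed.values())
```

With the indices kept, `inputs_for((1, 0), thresholds, datasets)` recovers the threshold and dataset behind a given result. The flattened list holds the same values, several of them identical, with nothing to say which parameter combination produced which. This is meant only as an analogy for the broken traceability the abstract describes.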
Subjects / Keywords
Scientific workflows; Provenance; Annotation; Static analysis

Related items

Showing items related by title and author.

  • LabelFlow: Exploiting Workflow Provenance to Surface Scientific Data Provenance
    Alper, Pinar; Belhajjame, Khalid; Goble, Carole; Karagoz, Pinar (2015) Conference paper
  • LabelFlow Framework for Annotating Workflow Provenance
    Alper, Pinar; Belhajjame, Khalid; Curcin, Vasa; Goble, Carole (2018) Article accepted for publication or published
  • Common motifs in scientific workflows: An empirical analysis
    Goble, Carole; Gil, Yolanda; Corcho, Oscar; Belhajjame, Khalid; Garijo, Daniel; Alper, Pinar (2014) Article accepted for publication or published
  • Lineage-Preserving Anonymization of the Provenance of Collection-Based Workflows
    Belhajjame, Khalid (2020) Conference paper
  • On the Anonymization of Workflow Provenance without Compromising the Transparency of Lineage
    Belhajjame, Khalid (2022) Article accepted for publication or published