Polyflow: a Polystore-compliant Mechanism to Provide Interoperability to Heterogeneous Provenance Graphs
Abstract
Many scientific experiments are modeled as workflows. Data from a workflow is captured by Workflow Management Systems (WfMSs). Each WfMS has its own format for representing provenance (metadata that describes the history of the generated data) and stores it as a graph at its own level of granularity. Provenance allows scientists to analyze and evaluate the results produced by a workflow. However, a challenge arises in more complex scenarios, in which a scientist needs to analyze provenance graphs generated by multiple WfMSs and workflows, because these graphs differ in format and granularity. To solve this problem, we propose Polyflow, a tool based on the concept of Polystore systems that integrates several databases of heterogeneous origin by adopting a global ProvONE schema. Polyflow allows scientists to query multiple provenance graphs in an integrated way. We evaluate Polyflow with experts using provenance data collected from real phylogenetic data analysis workflows.