ETL4LinkedProv: Managing Multigranular Linked Data Provenance

Authors

  • Rogers Reiche de Mendonça, Petróleo Brasileiro S.A.
  • Sérgio Manuel Serra da Cruz, UFRRJ
  • Maria Luiza Machado Campos, UFRJ

Keywords

ETL, Linked Data, RDF, Provenance, Workflows, LOD2

Abstract

This article presents ETL4LinkedProv, an approach for managing the collection and publication, as Linked Data, of provenance at distinct levels of granularity. The approach uses ETL workflows and a component named Provenance Collector Agent to collect two kinds of provenance (prospective and retrospective) and integrate them with domain data. The component also sets the granularity of the provenance to be captured. ETL4LinkedProv is evaluated in a real-world scenario in which Brazilian government agencies produce and publish public data sources as Linked Data. We also measure the amount of provenance generated, in terms of both the runtime of the ETL workflows and the number of published RDF triples.

Published

2017-02-03