Funding Institutions:

CAPES (grant #1658841)
CNPq/MuZOO project (grant #305110/2016-0)
CNPq/INCT in Web Science (grant #557128/2009-9)
FAPESP/Cepid in Computational Engineering and Sciences (grant #2013/08293-7)
FAPESP-PRONEX/eScience project (grant #2011/50761-2)
The Microsoft Research FAPESP Virtual Institute/NavScales project (grant #2011/52070-7)

People involved:

Joana E. Gonzales Malaverri
Renato Beserra Sousa

Person in charge:

Claudia Bauzer Medeiros


Project Overview


The goal of this project is to design and develop a workflow-based computational framework for assessing the data quality of scientific experiments. The idea is to combine quality attributes, which specialists specify within a given context, with metadata on the provenance of a data set. To this end, we created ProvenFrame, a framework that uses historical information about data and processes to estimate data quality. Building on ProvenFrame, we developed Quality Flow, a tool with which experts can enrich their scientific workflows and workflow components with quality attributes. At the same time, different users can define their own quality dimensions and metrics for a given workflow. A first prototype of Quality Flow was implemented as a simple web interface, using data files generated by the Taverna Scientific Workflow Management System. We are currently migrating Quality Flow into an ecosystem of tools that support the reproducibility and management of in silico experiments.
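As an illustration only, the idea of combining user-defined quality dimensions with provenance metadata might be sketched as follows. This is not the ProvenFrame or Quality Flow implementation: the names `QualityDimension`, `assess_quality`, `reliability`, and `timeliness`, the provenance keys `source_trusted` and `age_days`, and the weighted-average aggregation are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: provenance metadata as a flat dictionary.
Provenance = Dict[str, object]


@dataclass
class QualityDimension:
    """A user-defined quality dimension (hypothetical structure)."""
    name: str
    metric: Callable[[Provenance], float]  # expected to return a score in [0, 1]
    weight: float                          # relative importance chosen by the user


def assess_quality(
    provenance: Provenance, dimensions: List[QualityDimension]
) -> Tuple[Dict[str, float], float]:
    """Score each dimension and aggregate with a weighted average (an assumption)."""
    total_weight = sum(d.weight for d in dimensions)
    if total_weight <= 0:
        raise ValueError("at least one dimension needs a positive weight")
    scores = {d.name: d.metric(provenance) for d in dimensions}
    overall = sum(scores[d.name] * d.weight for d in dimensions) / total_weight
    return scores, overall


def reliability(prov: Provenance) -> float:
    # Assumed metric: 1.0 if the data source is marked as trusted, else 0.0.
    return 1.0 if prov.get("source_trusted") else 0.0


def timeliness(prov: Provenance) -> float:
    # Assumed metric: decays linearly to 0 over one year of data age.
    age_days = float(prov.get("age_days", 0))
    return max(0.0, 1.0 - age_days / 365.0)


dimensions = [
    QualityDimension("reliability", reliability, weight=2.0),
    QualityDimension("timeliness", timeliness, weight=1.0),
]

scores, overall = assess_quality({"source_trusted": True, "age_days": 0}, dimensions)
print(scores, overall)
```

In this sketch, a different user could pass a different list of dimensions or weights for the same provenance record, which mirrors the point that quality is assessed within a context.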


Figure: Supporting Data Quality Assessment in eScience



  • Joana E. G. Malaverri, André Santanchè, and Claudia Bauzer Medeiros. A provenance-based approach to evaluate data quality in eScience. Int. Journal of Metadata, Semantics and Ontologies 9, 1 (2014), 1518 [PDF]


  • Renato Beserra Sousa, Daniel Cintra Cugler, Joana Esther Gonzales Malaverri, and Claudia Bauzer Medeiros. A provenance-based approach to manage long term preservation of scientific data. In Data Engineering Workshops (ICDEW), 2014 IEEE 30th Int. Conf. on. IEEE, 162133. [PDF]


  • Renato Beserra. 2015. Quality Flow: a collaborative quality-aware platform for experiments in eScience. Master’s thesis. Instituto de Computação – Universidade Estadual de Campinas. [PDF]


  • Joana E. G. Malaverri. 2013. Supporting data quality assessment in eScience: a provenance based approach. PhD dissertation. [PDF]