Summary-based Comparison of Data Quality across Public MAGE-ML Genomic Datasets
Keywords:
XML, data quality, mage-ml, functional genomic data standards and public collections, schema evolution
Abstract
Extensive microarray experimental data are available online, facilitating independent evaluation of experiment
conclusions and enabling reuse. Numerous microarray experiment datasets are published using the MAGE-ML
XML schema, but assessing the quality of published experiments remains challenging because microarray users
have not agreed on a framework for measuring dataset quality.
In this paper, we apply DescribeX-based techniques to analyze public MAGE-ML collections both quantitatively
and qualitatively, gaining insights into schema evolution. Our case study shows that DescribeX is a useful tool for
evaluating the quality of microarray experiment data: it enhances the understanding of the instance-level structure of
MAGE-ML datasets and of how that structure evolves.
Published
2011-08-10
Section
SBBD 2010 Short Papers