Data Preparation for Longitudinal Data Mining: a case study on human ageing

Authors

  • Caio Eduardo Ribeiro, Pontifícia Universidade Católica de Minas Gerais
  • Luis Enrique Zárate, Pontifícia Universidade Católica de Minas Gerais

Keywords:

data mining, data preparation, knowledge discovery, longitudinal data mining, preprocessing

Abstract

Adequate preparation of a database is essential for extracting the useful knowledge it contains. In longitudinal studies, which follow a fixed set of records over a period of time, the data preparation process must adapt to the characteristics that the temporal aspect adds to the database. This article presents the data preparation process for a real longitudinal database from a human ageing study. The process covers the conceptual feature selection applied to the database's attributes and the preprocessing of the data, including noisy data elimination, variable merging, discretization, outlier detection, and missing data analysis. The guidelines for these procedures were generalized so that they can be replicated on other longitudinal databases in similar studies.

Published

2017-02-03