FMI-SiR: A Flexible and Efficient Module for Similarity Searching on Oracle Database

Authors

  • Daniel S. Kaster, UEL
  • Pedro H. Bugatti, ICMC-USP
  • Agma J. M. Traina, ICMC-USP
  • Caetano Traina Jr., ICMC-USP

Abstract

The volume of complex data (images, videos, audio, time series, DNA sequences, and others) has been growing at a very fast pace. Although such data are not naturally handled by Database Management Systems (DBMSs), it is necessary to store them in databases. Complex data are well suited to being queried by similarity, and several works have addressed techniques for similarity searching. However, most of these techniques were not designed to be integrated into a database engine. Including similarity search in the database core makes it possible to take advantage of the DBMS resources to perform queries that integrate complex and conventional data. Oracle Corp. developed the Oracle interMedia module to support multimedia data in its database manager, providing several operations to manipulate them. It supports content-based image retrieval through proprietary functions that extract intrinsic features from images and compute their similarity.
In this paper we describe another module for similarity search, also developed using Oracle's Extensible Architecture Framework. Our approach allows user-defined feature extraction methods and distance functions to be included in the database core, thereby providing wider flexibility. The similarity operators supported include both similarity selection on a single relation and similarity range joins performed over two relations. The experiments show that employing our module to query images by content improves on the results obtained using Oracle alone, both in the precision of the results and in query execution performance.
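
The operators named above can be illustrated with a minimal Python sketch. It is not the module's implementation or its SQL interface: the function names (range_select, knn_select, range_join), the in-memory dictionaries standing in for relations, and the two sample distance functions are all assumptions made for illustration. Nearest-neighbor selection is included as a common kind of similarity selection; the abstract does not say which selection variants the module supports. The sketch only shows how a pluggable distance function drives a similarity selection over one relation and a similarity range join over two.

```python
import heapq
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """Sample user-defined distance: L2 norm of the difference."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Vector, b: Vector) -> float:
    """Alternative user-defined distance: L1 norm of the difference."""
    return sum(abs(x - y) for x, y in zip(a, b))


def range_select(relation: Dict[str, Vector], query: Vector,
                 radius: float, dist: Distance = euclidean) -> List[str]:
    """Similarity range selection: keys whose features lie within radius of query."""
    return [key for key, feat in relation.items() if dist(feat, query) <= radius]


def knn_select(relation: Dict[str, Vector], query: Vector,
               k: int, dist: Distance = euclidean) -> List[str]:
    """Nearest-neighbor selection: the k keys closest to query, nearest first."""
    return heapq.nsmallest(k, relation, key=lambda key: dist(relation[key], query))


def range_join(left: Dict[str, Vector], right: Dict[str, Vector],
               radius: float, dist: Distance = euclidean) -> List[Tuple[str, str]]:
    """Similarity range join: pairs (l, r) whose features are within radius.

    Nested-loop evaluation, quadratic in the relation sizes; a real engine
    would normally rely on an index instead.
    """
    return [(lk, rk)
            for lk, lf in left.items()
            for rk, rf in right.items()
            if dist(lf, rf) <= radius]


# Hypothetical feature vectors, e.g. already extracted from images.
images = {"img1": (0.0, 0.0), "img2": (1.0, 0.0), "img3": (5.0, 5.0)}
scans = {"scanA": (0.0, 1.0), "scanB": (5.0, 4.0)}

near_origin = range_select(images, (0.0, 0.0), 1.0)
closest = knn_select(images, (4.0, 4.0), 1)
pairs = range_join(images, scans, 1.0)
pairs_l1 = range_join(images, scans, 2.0, dist=manhattan)
```

Passing the distance function as a parameter mirrors the flexibility argument: swapping euclidean for manhattan changes which pairs qualify without touching the operators themselves.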

Author Biography

  • Daniel S. Kaster, UEL
    Daniel S. Kaster received the B.Sc. degree in Computer Science from the University of Londrina, Brazil, in 1998 and the M.Sc. degree in Computer Science from the University of Campinas, Brazil, in 2001, under the supervision of Prof. Claudia Bauzer Medeiros. He is currently a Lecturer with the Computer Science Department of the University of Londrina, Brazil, and a Ph.D. candidate in Computer Science at the University of São Paulo at São Carlos, Brazil, under the supervision of Prof. Caetano Traina Jr. His research interests include searching complex data and multimedia and geographic databases.

Published

2010-09-09