Unary and n-ary inclusion dependency discovery in relational databases期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

Unary and n-ary inclusion dependency discovery in relational databases

Authors:	Fabien De Marchi Stéphane Lopes Jean-Marc Petit

Affiliation:	(1) Laboratoire LIRIS, Université de LYON, Université LYON 1, CNRS UMR-5205, 69621 Villeurbanne, France;(2) Laboratoire PRISM, Université de Versailles Saint-Quentin en Yvelines, CNRS UMR-8144, 78035 Versailles Cedex, France;(3) Laboratoire LIRIS, Université de LYON, INSA de Lyon, CNRS UMR-5205, 69621 Villeurbanne, France

Abstract:	Foreign keys form one of the most fundamental constraints for relational databases. Since they are not always defined in existing databases, the discovery of foreign keys turns out to be an important and challenging task. The underlying problem is known to be the inclusion dependency (IND) inference problem. In this paper, data-mining algorithms are devised for IND inference in a given database. We propose a two-step approach. In the first step, unary INDs are discovered thanks to a new preprocessing stage which leads to a new algorithm and to an efficient implementation. In the second step, n-ary IND inference is achieved. This step fits in the framework of levelwise algorithms used in many data-mining algorithms. Since real-world databases can suffer from some data inconsistencies, approximate INDs, i.e. INDs which almost hold, are considered. We show how they can be safely integrated into our unary and n-ary discovery algorithms. An implementation of these algorithms has been achieved and tested against both synthetic and real-life databases. Up to our knowledge, no other algorithm does exist to solve this data-mining problem.

Keywords:	Inclusion dependency discovery Relational databases
本文献已被 SpringerLink 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏