Towards certain fixes with editing rules and master data |
| |
Authors: | Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Wenyuan Yu |
| |
Affiliation: | (1) University of Edinburgh, Edinburgh, UK; (2) Harbin Institute of Technology, Harbin, China; (3) Beihang University, Beijing, China |
| |
Abstract: | A variety of integrity constraints have been studied for data cleaning. While these constraints can detect the presence of
errors, they fall short of guiding us to correct the errors. Indeed, data repairing based on these constraints may not find
certain fixes that are guaranteed correct, and worse still, may even introduce new errors when attempting to repair the data. We propose
a method for finding certain fixes, based on master data, a notion of certain regions, and a class of editing rules. A certain region is a set of attributes that the users assure to be correct. Given a certain region and master data, editing
rules tell us what attributes to fix and how to update them. We show how the method can be used in data monitoring and enrichment.
We also develop techniques for reasoning about editing rules, to decide whether they lead to a unique fix and whether they
are able to fix all the attributes in a tuple, relative to master data and a certain region. Furthermore, we present a framework and an algorithm to find certain fixes, by interacting
with the users to ensure that one of the certain regions is correct. We experimentally verify the effectiveness and scalability
of the algorithm. |
| |
Keywords: | |
This article has been indexed by SpringerLink and other databases. |
|
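The abstract's idea that, given a certain region and master data, editing rules say which attributes to fix and how to update them can be illustrated with a minimal sketch. Everything named here (`EditingRule`, `apply_rules`, the attribute names, and the sample master records) is hypothetical and chosen for illustration; the rule form is a simplified reading of the abstract, not the paper's formal definition. A rule fires only when all the attributes it matches on are already assured correct and the matching master tuples agree on a single value, which is one plausible way to model a unique fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingRule:
    """Hypothetical rule: if the input tuple agrees with a master tuple on
    every (input_attr, master_attr) pair in `match`, copy `source` from the
    master tuple into `target` of the input tuple."""
    match: tuple
    target: str
    source: str


def apply_rules(tup, certain, master, rules):
    """Fix attributes of `tup` starting from the assured attributes `certain`.

    Returns the updated tuple and the enlarged set of assured attributes.
    A rule is skipped unless all attributes it matches on are assured and
    the matching master tuples supply exactly one candidate value.
    """
    fixed = dict(tup)
    certain = set(certain)
    changed = True
    while changed:  # repeat until no rule can fix anything new
        changed = False
        for rule in rules:
            if rule.target in certain:
                continue
            if not all(attr in certain for attr, _ in rule.match):
                continue
            candidates = {
                m[rule.source]
                for m in master
                if all(fixed[a] == m[b] for a, b in rule.match)
            }
            if len(candidates) == 1:  # unambiguous value from master data
                fixed[rule.target] = candidates.pop()
                certain.add(rule.target)
                changed = True
    return fixed, certain


# Illustrative master data and rules (assumed, not from the paper).
master = [
    {"zip": "EH8 9AB", "city": "Edinburgh", "area_code": "131"},
    {"zip": "100191", "city": "Beijing", "area_code": "10"},
]
rules = [
    EditingRule(match=(("zip", "zip"),), target="city", source="city"),
    EditingRule(match=(("city", "city"),), target="area_code", source="area_code"),
]

dirty = {"zip": "EH8 9AB", "city": "Ldn", "area_code": "020"}
fixed, certain = apply_rules(dirty, {"zip"}, master, rules)
```

In this sketch the user assures only `zip`; the first rule then fixes `city`, which in turn lets the second rule fix `area_code`, so the certain region grows until it covers the whole tuple. With an empty certain region no rule fires and the tuple is left unchanged.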