Extendible and Interactive Data Cleaning System (可扩展和可交互的数据清洗系统)
Cite this article: BAO Cong-jian, LI Xing-yi, SHI Hua-ji. Extendible and Interactive Data Cleaning System[J]. Microcomputer Development (微机发展), 2007, 17(7): 84-86
Authors: BAO Cong-jian (包从剑), LI Xing-yi (李星毅), SHI Hua-ji (施化吉)
Affiliation: School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China
Article ID: 1673-629X(2007)07-0084-03
Revised: 2006-10-04

Abstract: Extensibility and interactivity are the main features of the data cleaning system. To illustrate them, the paper lists the causes of abnormal data, explains each functional module with a system framework diagram, proposes statistical and other methods for detecting abnormal data, and gives a cleaning strategy for each type of abnormal data. It then describes how to evaluate the quality of the algorithms and the accuracy of the data, and finally presents the whole system as a flow chart. Results from cleaning population data show that the quality of the population data improved substantially and that the system executes efficiently.

Keywords: data warehouse; data detection; data cleaning
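
The abstract mentions statistical detection of abnormal data but does not say which statistics the authors use. As a rough illustration only, the sketch below flags suspicious values in a population-style age column with two checks: a modified z-score based on the median absolute deviation (a common robust rule, swapped in here as an assumption) and a plain range rule. The function names, the `DETECTORS` registry, the thresholds, and the sample ages are all hypothetical and are not taken from the paper.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds `threshold`.

    Assumption: the modified z-score 0.6745 * |x - median| / MAD stands in
    for whatever statistical test the paper actually uses.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread, so nothing can be scored
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


def range_violations(values, low, high):
    """Return indices of values outside the closed interval [low, high]."""
    return [i for i, v in enumerate(values) if not low <= v <= high]


# Hypothetical registry: adding a detector means adding one entry here,
# which is one simple way to keep a cleaning pipeline extensible.
DETECTORS = {
    "mad": mad_outliers,
    "age_range": lambda values: range_violations(values, 0, 120),
}


def flag_all(values):
    """Run every registered detector and collect flagged indices by name."""
    return {name: detect(values) for name, detect in DETECTORS.items()}


# Made-up sample: two entries (420 and -5) are clearly invalid ages.
ages = [34, 29, 41, 38, 27, 45, 31, 36, 33, 420, -5]
flags = flag_all(ages)
print(flags)
```

In an interactive setting, the flagged indices would be shown to a user for confirmation before any cleaning strategy is applied, rather than corrected automatically.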
This article is indexed in CNKI and other databases.