
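A Study of a Process-Based Data Cleaning Framework
Cite this article: Dong Ming (董明), Zhang Yun (张芸), Cao Qujiang (曹渠江). A Study of a Process-Based Data Cleaning Framework[J]. Computer Applications and Software (计算机应用与软件), 2009, 26(9): 157-158,171
Authors: Dong Ming (董明), Zhang Yun (张芸), Cao Qujiang (曹渠江)
Affiliations: School of International Exchange, Shandong Institute of Commerce and Technology, Jinan, Shandong 250103; Yinzuo Group Co., Ltd., Jinan, Shandong 250011; University of Shanghai for Science and Technology, Shanghai 200093
Abstract: Earlier data cleaning methods required rules to be hand-coded against a schema, which was time-consuming and difficult, and the rules were hard to modify afterwards. This paper proposes a new framework for eliminating approximately duplicate records that lets users complete data cleaning simply, without writing any code. The framework provides an open algorithm library, an open function library, and a fuzzy inference system based on fuzzy rules and membership functions, which give it strong generality and applicability. Finally, experiments verify the effectiveness of the framework.

Keywords: data cleaning; duplicate records; extensible framework; fuzzy inference system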

STUDY ON PROCESS BASED DATA CLEANING FRAMEWORK
Dong Ming, Zhang Yun, Cao Qujiang. STUDY ON PROCESS BASED DATA CLEANING FRAMEWORK[J]. Computer Applications and Software, 2009, 26(9): 157-158,171
Authors: Dong Ming, Zhang Yun, Cao Qujiang
Affiliation: Shandong Institute of Commerce and Technology, Jinan 250103, Shandong, China; Yinzuo Group Co., Ltd., Jinan 250011, China; University of Shanghai for Science and Technology, Shanghai 200093, China
Abstract: Earlier approaches to data cleaning, which required encoding rules based on a schema, were time-consuming and difficult, and users could not adapt the rules later. This paper proposes a novel duplicate-elimination framework that lets users clean data flexibly and effortlessly, without any coding. The extensible framework has an open algorithm library, an open function library, and a Fuzzy Inference System based on fuzzy rules and membership functions, which make it general and adaptable. At last the experimental result...
Keywords: Data cleaning; Duplicate record; Extensible framework; Fuzzy inference system
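
The abstract mentions a fuzzy inference system built on fuzzy rules and membership functions for detecting duplicate records, but this record does not give the paper's actual rules or algorithms. The following is only a minimal illustrative sketch of the general idea, under stated assumptions: the field names (`name`, `address`), the ramp-shaped membership function and its breakpoints (0.5 and 0.9), the single AND rule, and the 0.5 decision threshold are all hypothetical, and Python's `difflib.SequenceMatcher` stands in for whatever similarity algorithms the framework's library provides.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a library algorithm."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def high(x: float, start: float = 0.5, full: float = 0.9) -> float:
    """Ramp membership function for the fuzzy set 'high similarity'.

    Returns 0 at or below `start`, 1 at or above `full`, linear in between.
    The breakpoints are assumed values, not taken from the paper.
    """
    if x <= start:
        return 0.0
    if x >= full:
        return 1.0
    return (x - start) / (full - start)


def duplicate_degree(rec_a: dict, rec_b: dict) -> float:
    """Degree in [0, 1] to which two records are considered duplicates."""
    name = high(similarity(rec_a["name"], rec_b["name"]))
    addr = high(similarity(rec_a["address"], rec_b["address"]))
    # Hypothetical rule: IF name is high AND address is high THEN duplicate.
    # Fuzzy AND is taken as the minimum of the two membership degrees.
    return min(name, addr)


def find_duplicates(records: list, threshold: float = 0.5) -> list:
    """Return index pairs (i, j), i < j, whose duplicate degree meets the threshold."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if duplicate_degree(records[i], records[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

In a sketch like this, changing the cleaning behavior means editing rule parameters (membership breakpoints, threshold) rather than rewriting schema-specific code, which is the kind of flexibility the abstract claims for the framework.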
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.