
Identification Method of Approximately Duplicate Records Based on Fuzzy Integrated Estimation (基于模糊综合评判的相似重复记录识别方法)
Cite this article: XIAO Man-sheng, ZHOU Hao-hui, WANG Hong. Identification Method of Approximately Duplicate Records Based on Fuzzy Integrated Estimation[J]. Computer Engineering (计算机工程), 2010, 36(13): 51-53.
Authors: XIAO Man-sheng (肖满生), ZHOU Hao-hui (周浩慧), WANG Hong (王宏)
Affiliation: 1. College of Science and Technology, Hunan University of Technology, Zhuzhou 412008; 2. Changsha Commerce & Tourism College, Changsha 410004
Funding: Scientific research fund of the Hunan Provincial Education Department; Hunan Provincial Science and Technology Plan fund
Abstract: When approximately duplicate records are identified by string matching, attribute weights are usually assigned too subjectively. To address this problem, the paper proposes a method that obtains attribute weights through fuzzy integrated estimation (fuzzy comprehensive evaluation). Multiple users grade the factors that make up each attribute's importance, a fuzzy mapping turns these grades into weights that reflect attribute importance, and approximately duplicate records are then identified on the basis of these weights. Theoretical analysis and experiments indicate that the method obtains attribute weights objectively and therefore achieves higher precision in identifying approximately duplicate records.

Keywords: fuzzy integrated estimation; approximately duplicate records; attribute weight; similarity
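
The abstract gives the pipeline only in outline, so the following is a minimal sketch of one common form of fuzzy comprehensive evaluation, not the authors' actual formulas. Everything specific here is an assumption made for illustration: the four-level grade scale and its scores (`GRADE_SCORES`), the factor weights, the weighted-average composition B = A · R, the sample votes, and the use of Python's `difflib.SequenceMatcher` as the string-matching measure.

```python
from difflib import SequenceMatcher

# Assumed grade scale: index 0 = "very important" ... index 3 = "unimportant".
GRADE_SCORES = [1.0, 0.75, 0.5, 0.25]


def membership_row(grades, n_grades):
    """Fraction of users who chose each grade for one factor."""
    return [grades.count(g) / len(grades) for g in range(n_grades)]


def attribute_score(votes, factor_weights):
    """Score one attribute from user votes.

    votes[f] lists the grade index each user gave to factor f.
    The rows form the fuzzy relation matrix R; B = A . R is computed
    with a weighted-average operator and collapsed to a single score.
    """
    n = len(GRADE_SCORES)
    rows = [membership_row(v, n) for v in votes]
    b = [sum(a * row[g] for a, row in zip(factor_weights, rows))
         for g in range(n)]
    return sum(bg * s for bg, s in zip(b, GRADE_SCORES))


def attribute_weights(votes_by_attr, factor_weights):
    """Normalize attribute scores so that the weights sum to 1."""
    scores = {attr: attribute_score(v, factor_weights)
              for attr, v in votes_by_attr.items()}
    total = sum(scores.values())
    return {attr: s / total for attr, s in scores.items()}


def record_similarity(r1, r2, weights):
    """Weighted sum of per-attribute string similarities."""
    return sum(w * SequenceMatcher(None, r1[attr], r2[attr]).ratio()
               for attr, w in weights.items())


# Hypothetical data: two factors per attribute, four users per factor.
votes = {
    "name": [[0, 0, 1, 0], [0, 1, 0, 0]],
    "city": [[2, 3, 3, 2], [3, 3, 2, 3]],
}
weights = attribute_weights(votes, factor_weights=[0.6, 0.4])
a = {"name": "Li Wei", "city": "Changsha"}
b = {"name": "Li Wie", "city": "Changsha"}
print(weights, record_similarity(a, b, weights))
```

In a sketch like this, two records would be flagged as approximate duplicates when `record_similarity` exceeds a chosen threshold; the threshold itself is not given in the abstract and would have to be tuned on the data.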
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).