
Research on Mining Association Rules Based on Multi-Granularity Attribute Reduction
(original Chinese title: 考虑多粒度属性约简的关联规则挖掘研究)
Cite this article: YANG Zhen, GENG Xiuli. Research on Mining Association Rules Based on Multi-Granularity Attribute Reduction[J]. Computer Engineering and Applications (计算机工程与应用), 2019, 55(6): 133-139. DOI: 10.3778/j.issn.1002-8331.1712-0254
Authors: YANG Zhen (杨珍), GENG Xiuli (耿秀丽)
Affiliation: Business School, University of Shanghai for Science and Technology, Shanghai 200093, China
Funding: National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education; Innovation Program of the Shanghai Municipal Education Commission; Shanghai First-Class Discipline Project; Hujiang Foundation
Abstract: In the era of big data, it has become harder for people to obtain the information they need, and data mining is currently a key technology for addressing this problem. The Apriori algorithm, a commonly used data mining algorithm, works by mining the potential association rules behind the data. To address the frequent data scans and cumbersome candidate-set generation of the traditional Apriori algorithm, a weighted Apriori algorithm is proposed: each repeated record is stored only once, and the ratio of its number of occurrences to the total number of records is used as its weight, which compresses storage. A binary Boolean matrix replaces the original data set, and the maximal frequent itemsets are obtained through AND operations within the matrix, which reduces time complexity. Considering the redundancy of the original data and the imprecision of rough-set attribute reduction, an attribute reduction algorithm based on multi-granularity rough sets is applied before the association rules are extracted. It describes the uncertainty of information through knowledge granularity and refines attribute values to improve reduction precision and reduce space complexity. Finally, the proposed method is compared with an Apriori algorithm based on frequent matrices and with the original Apriori algorithm to verify its practicality and effectiveness.

Keywords: multi-granularity rough set; attribute reduction; binary; weighted Apriori algorithm
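
The two ideas in the abstract that concern the Apriori step (storing each repeated record once with a frequency-ratio weight, and counting support with AND operations on a Boolean matrix) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the packing of each Boolean column into a Python integer, and the simple level-wise join are all assumptions made for illustration. The sketch also returns every frequent itemset rather than only the maximal ones, and it omits the rough-set attribute reduction step entirely.

```python
from collections import Counter
from itertools import combinations


def compress_records(transactions):
    """Store each distinct record once; its weight is count / total records."""
    counts = Counter(frozenset(t) for t in transactions)
    total = len(transactions)
    records = list(counts)
    weights = [counts[r] / total for r in records]
    return records, weights


def item_columns(records, items):
    """One Boolean column per item, packed into an int (hypothetical layout):
    bit i is set when distinct record i contains the item."""
    return {
        item: sum(1 << i for i, r in enumerate(records) if item in r)
        for item in items
    }


def weighted_support(columns, weights, itemset):
    """AND the columns of the itemset, then add up the weights of the
    records whose bit survives."""
    mask = (1 << len(weights)) - 1
    for item in itemset:
        mask &= columns[item]
    return sum(w for i, w in enumerate(weights) if (mask >> i) & 1)


def frequent_itemsets(transactions, min_support):
    """Level-wise search: keep itemsets whose weighted support reaches
    min_support, then join survivors that share all but their last item."""
    items = sorted({i for t in transactions for i in t})
    records, weights = compress_records(transactions)
    columns = item_columns(records, items)

    result = {}
    current = [(item,) for item in items]
    while current:
        survivors = []
        for itemset in current:
            support = weighted_support(columns, weights, itemset)
            if support >= min_support:
                survivors.append(itemset)
                result[itemset] = support
        # survivors is sorted, so a precedes b and a[-1] < b[-1] when prefixes match
        joined = set()
        for a, b in combinations(survivors, 2):
            if a[:-1] == b[:-1]:
                joined.add(a + (b[-1],))
        current = sorted(joined)
    return result
```

With five transactions of which two are identical, `compress_records` keeps four distinct records whose weights sum to 1, and each support count then needs only one bitwise AND per item instead of a scan over the raw data.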
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).