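An effective algorithm for mining probabilistic frequent itemsets from uncertain data*
Cite this article: LIU Li-xin, ZHANG Xiao-lin, MAO Yi-min. An effective algorithm for mining probabilistic frequent itemsets from uncertain data[J]. Application Research of Computers (计算机应用研究), 2012, 29(3): 841-843.
Authors: LIU Li-xin  ZHANG Xiao-lin  MAO Yi-min
Affiliations: 1. School of Information Engineering, Inner Mongolia University of Science & Technology, Baotou, Inner Mongolia 014010
2. School of Information Science & Engineering, Central South University, Changsha 410083
Funding: National Natural Science Foundation of China (61163015); Ministry of Education "Chunhui Program" (Z2009-1-01024)
Abstract: The PFIM algorithm is limited by the way it calculates the frequentness probability, and during mining it must scan the database many times and generate a large number of candidate sets. To address these shortcomings, this paper proposes the EPFIM (efficient probabilistic frequent itemset mining) algorithm. The newly proposed frequentness probability calculation adapts to situations, such as data streams, in which the probabilities of itemsets change. Mining efficiency is improved by storing the uncertain database in a probability matrix, by exploiting the ordering of itemsets, and by progressively deleting useless transactions. Theoretical analysis and experimental results show that EPFIM performs better.

Keywords: uncertain data  possible world  expected support  probabilistic frequent itemset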

Efficient mining probabilistic frequent itemset in uncertain databases
LIU Li-xin,ZHANG Xiao-lin,MAO Yi-min.Efficient mining probabilistic frequent itemset in uncertain databases[J].Application Research of Computers,2012,29(3):841-843.
Authors:LIU Li-xin  ZHANG Xiao-lin  MAO Yi-min
Affiliation:(1.School of Information Engineering, Inner Mongolia University of Science & Technology, Baotou Inner Mongolia 014010, China; 2.School of Information Science & Engineering, Central South University, Changsha 410083, China)
Abstract: The way PFIM calculates the frequentness probability limits its applications; it also needs to scan the database many times and generates a large number of candidate sets. This paper proposes a new algorithm named EPFIM. First, the new method of calculating the frequentness probability makes it easier to update the frequentness probability of an itemset and can be applied in more situations. Second, it stores the database in an uncertain probability matrix in order to reduce the number of database scans. In addition, exploiting the ordering of items and gradually deleting unneeded transactions improves mining efficiency. Theoretical analysis and experimental results show that EPFIM performs better.
Keywords: uncertain databases  possible world  expected support  probabilistic frequent itemset
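
For readers unfamiliar with the keywords above, the following is a minimal sketch of how expected support and frequentness probability are commonly computed from a probability matrix. It is not the EPFIM procedure, which this abstract does not specify. The assumptions are: rows of the matrix are transactions, columns are items, each entry is the probability that the item occurs in the transaction, and items and transactions are mutually independent. All function names are hypothetical.

```python
from typing import List, Sequence


def itemset_probs(matrix: Sequence[Sequence[float]],
                  itemset: Sequence[int]) -> List[float]:
    """Probability that each transaction contains every item in `itemset`.

    Assumes items within a transaction occur independently.
    """
    probs = []
    for row in matrix:
        p = 1.0
        for j in itemset:
            p *= row[j]
        probs.append(p)
    return probs


def expected_support(probs: Sequence[float]) -> float:
    """Expected number of transactions that contain the itemset."""
    return sum(probs)


def frequentness_probability(probs: Sequence[float], min_sup: int) -> float:
    """P(support >= min_sup), assuming independent transactions.

    dist[k] (k < min_sup) is the probability that exactly k of the
    transactions seen so far contain the itemset; dist[min_sup]
    accumulates the probability of at least min_sup occurrences.
    """
    if min_sup <= 0:
        return 1.0
    dist = [1.0] + [0.0] * min_sup
    for p in probs:
        # Update from the top down so each step reads the old dist[k - 1].
        dist[min_sup] += dist[min_sup - 1] * p
        for k in range(min_sup - 1, 0, -1):
            dist[k] = dist[k] * (1.0 - p) + dist[k - 1] * p
        dist[0] *= 1.0 - p
    return dist[min_sup]


# Hypothetical two-transaction, two-item database.
matrix = [[0.5, 1.0],
          [0.5, 1.0]]
probs = itemset_probs(matrix, (0, 1))
print(expected_support(probs))
print(frequentness_probability(probs, 1))
print(frequentness_probability(probs, 2))
```

Because the loop consumes one transaction at a time, a newly arriving transaction costs a single extra pass over `dist` if that list is kept between calls. This is one plausible way to handle the changing probabilities mentioned in the abstract, offered here only as an illustration.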