
Mining Algorithm of Frequent Closed Itemsets Based on Uncertain Data
Citation: ZHANG Shu-yun, ZHANG Shou-zhi. Mining Algorithm of Frequent Closed Itemsets Based on Uncertain Data[J]. Computer Engineering (计算机工程), 2014, 0(3): 51-54
Authors: ZHANG Shu-yun (章淑云), ZHANG Shou-zhi (张守志)
Affiliation: School of Computer Science, Fudan University, Shanghai 200433, China
Abstract: For uncertain data, the traditional method of judging whether an itemset is frequent cannot accurately express how frequent the itemset is. In addition, for large datasets, the set of frequent itemsets is huge and redundant. To address these two shortcomings, this paper proposes UFCIM, an algorithm for mining frequent closed itemsets from uncertain data, built on the level-wise mining algorithm Apriori. UFCIM uses a probability of confidence to express how reliable the judgment of frequency is: the higher the probability of confidence, the more likely the itemset is frequent. Because frequent closed itemsets are a compact, lossless representation of frequent itemsets, the algorithm reports the frequent closed itemsets in place of the much larger set of frequent itemsets. Experimental results show that UFCIM mines frequent closed itemsets from uncertain data effectively and quickly, reducing itemset redundancy while preserving the accuracy and completeness of the itemsets.

Keywords: uncertain data; frequent closed itemsets; data mining; level mining; probability of confidence
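
The ideas summarized in the abstract can be illustrated with a minimal Python sketch. This is not the UFCIM algorithm itself, whose details the abstract does not give. It assumes an attribute-level uncertainty model in which each transaction maps items to independent existence probabilities, treats an itemset as frequent when P(support >= min_sup) reaches a threshold tau, and calls a frequent itemset closed when no one-item extension has the same per-transaction containment probabilities. The closedness definition, the function names (containment_probs, frequent_probability, mine_closed), the parameters min_sup and tau, and the example data are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import isclose, prod


def containment_probs(db, itemset):
    """Per-transaction probability that all items of `itemset` are present."""
    return tuple(prod(t.get(i, 0.0) for i in itemset) for t in db)


def frequent_probability(probs, min_sup):
    """P(support >= min_sup), assuming transactions are independent."""
    # dist[k] = probability that exactly k transactions seen so far contain
    # the itemset; every count >= min_sup is pooled into dist[min_sup].
    dist = [1.0] + [0.0] * min_sup
    for p in probs:
        new = [0.0] * (min_sup + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)
            new[min(k + 1, min_sup)] += mass * p
        dist = new
    return dist[min_sup]


def mine_closed(db, min_sup, tau):
    """Level-wise (Apriori-style) search, then a closedness filter."""
    level = {}
    for item in sorted({i for t in db for i in t}):
        v = containment_probs(db, (item,))
        if frequent_probability(v, min_sup) >= tau:
            level[(item,)] = v
    frequent = {}
    while level:
        frequent.update(level)
        nxt = {}
        # Join step: merge two k-itemsets that share their first k-1 items.
        for a, b in combinations(sorted(level), 2):
            if a[:-1] != b[:-1]:
                continue
            cand = a + (b[-1],)
            # Prune step: every k-subset of the candidate must be frequent.
            if any(s not in level for s in combinations(cand, len(a))):
                continue
            v = containment_probs(db, cand)
            if frequent_probability(v, min_sup) >= tau:
                nxt[cand] = v
        level = nxt
    closed = {}
    for x, v in frequent.items():
        # Assumed closedness test: x is absorbed by a one-item extension y
        # whose containment probabilities match in every transaction.
        absorbed = any(
            len(y) == len(x) + 1 and set(x) < set(y)
            and all(isclose(p, q) for p, q in zip(v, w))
            for y, w in frequent.items()
        )
        if not absorbed:
            closed[x] = frequent_probability(v, min_sup)
    return closed


# Hypothetical uncertain database: item -> existence probability.
example_db = [
    {"a": 0.9, "b": 1.0, "c": 0.5},
    {"a": 0.8, "b": 1.0},
    {"a": 0.7, "b": 1.0, "c": 0.6},
    {"b": 0.4, "c": 0.9},
]
example_closed = mine_closed(example_db, min_sup=2, tau=0.5)
```

In this toy database, item b is certain in every transaction that may contain a, so the itemset (a) is absorbed by (a, b) and only (b), (c), and (a, b) are reported. Because an absorbed itemset has the same support distribution as the extension that absorbs it, nothing about its frequency is lost, which mirrors the lossless compression property mentioned in the abstract.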
This article is indexed in CNKI, VIP (维普), and other databases.