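Fast Updating Maximum Frequent Itemsets
Cite this article: JI Gen-Lin, YANG Ming, SONG Yu-Qing, SUN Zhi-Hui. Fast Updating Maximum Frequent Itemsets [J]. Chinese Journal of Computers, 2005, 28(1): 128-135.
Authors: JI Gen-Lin  YANG Ming  SONG Yu-Qing  SUN Zhi-Hui
Affiliations: 1. Department of Computer Science and Engineering, Southeast University, Nanjing 210096; Department of Computer Science, Nanjing Normal University, Nanjing 210097
2. Department of Computer Science and Engineering, Southeast University, Nanjing 210096
Funding: Supported by the National Natural Science Foundation of China (79970092).
Abstract: Mining maximum frequent itemsets is a key problem in many data mining applications. To overcome the shortcomings of Apriori-based algorithms for mining maximum frequent itemsets, DMFIA uses the FP-tree storage structure and a top-down search strategy, which effectively improves mining efficiency. However, when there are many frequent items but the maximum frequent itemsets have relatively low dimension, DMFIA has to search through many levels and generates a large number of candidate itemsets at each level, which slows the algorithm down. This paper therefore proposes IDMFIA (the Improved algorithm of DMFIA). IDMFIA uses a bidirectional search strategy that combines top-down and bottom-up search, so it can prune, as early as possible, the supersets of shorter maximum frequent itemsets and the subsets of longer maximum frequent itemsets. In addition, the paper proposes FUMFIA (Fast Updating Maximum Frequent Itemsets Algorithm), an update algorithm that makes full use of the FP-tree already built and the maximum frequent itemsets already mined to maintain the mined maximum frequent itemsets efficiently. Experimental results show that IDMFIA and FUMFIA effectively improve the efficiency of mining and updating maximum frequent itemsets.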
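
As a rough illustration of the top-down strategy mentioned in the abstract, the following Python sketch finds maximum frequent itemsets on a toy transaction list. It is not DMFIA or IDMFIA: support is counted by scanning the transactions rather than through an FP-tree, there is no bottom-up pass, and the names `support` and `maximal_frequent_itemsets` as well as the sample data are assumptions made for this example. It only shows the subset-pruning idea: once an itemset is known to be a maximum frequent itemset, none of its subsets needs to be tested.

```python
def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_count):
    """Top-down, level-by-level search (illustrative only, no FP-tree).

    Starts from the set of all frequent items and shrinks infrequent
    candidates by one item per level. Candidates that are subsets of an
    already found maximum frequent itemset are pruned without counting.
    """
    transactions = [frozenset(t) for t in transactions]
    items = frozenset().union(*transactions)
    frequent_items = frozenset(
        i for i in items if support(frozenset([i]), transactions) >= min_count
    )

    found = []
    candidates = {frequent_items} if frequent_items else set()
    while candidates:
        next_level = set()
        for cand in candidates:
            if any(cand <= m for m in found):
                continue  # subset of a known result: frequent but not maximal
            if support(cand, transactions) >= min_count:
                found.append(cand)  # no frequent superset exists at this point
            else:
                # Infrequent: try every subset that is one item shorter.
                for item in cand:
                    sub = cand - {item}
                    if sub:
                        next_level.add(sub)
        # Prune the next level against everything found so far.
        candidates = {c for c in next_level if not any(c <= m for m in found)}
    return found


# Hypothetical toy data, minimum support count of 2.
toy = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c", "d"}, {"d"}]
mfis = maximal_frequent_itemsets(toy, 2)
print(sorted(sorted(m) for m in mfis))  # [['a', 'b', 'c'], ['d']]
```

On this data the single candidate of size four is infrequent, `{a, b, c}` is accepted at size three, and its six nonempty proper subsets are never counted. The worst case is still exponential, which is the kind of cost the FP-tree and the bidirectional search described in the abstract aim to reduce.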

Keywords: data mining; frequent pattern tree; maximum frequent itemset; updating

Fast Updating Maximum Frequent Itemsets
JI Gen-Lin,YANG Ming,SONG Yu-qing,SUN Zhi-hui.Fast Updating Maximum Frequent Itemsets[J].Chinese Journal of Computers,2005,28(1):128-135.
Authors:JI Gen-Lin  YANG Ming  SONG Yu-qing  SUN Zhi-hui
Affiliation: JI Gen-Lin 1),2); YANG Ming 1),2); SONG Yu-Qing 1); SUN Zhi-Hui 1)
Abstract: Mining maximum frequent itemsets is a key problem in many data mining applications. To overcome the drawbacks of Apriori-like algorithms for mining maximum frequent itemsets, DMFIA was proposed; it uses the FP-tree structure and a top-down search strategy, which improves mining efficiency in some situations. However, for datasets with many frequent items in which each maximum frequent itemset is short, DMFIA needs to search many levels and generates many candidate maximum frequent itemsets at each level. This paper therefore proposes IDMFIA (the improved algorithm of DMFIA) for mining maximum frequent itemsets. IDMFIA can prune all supersets of a mined maximum frequent itemset that contains few items, and all nonempty subsets of a long maximum frequent itemset. Furthermore, this paper introduces FUMFIA (Fast Updating Maximum Frequent Itemsets Algorithm), which makes efficient use of the already built FP-tree and the already mined maximum frequent itemsets to update those itemsets. Experimental results show that the two algorithms are effective and efficient.
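
The abstract does not say which kinds of change FUMFIA handles, so the following Python sketch is only an illustration, under stated assumptions, of why previously mined results are worth keeping. It assumes the change is a raised minimum support count over the same transactions. In that case every itemset that is frequent under the new threshold was frequent under the old one, so each new maximum frequent itemset is a subset of some old one and the search can start from the old results instead of from scratch. The sketch does not use an FP-tree, and the function name `update_after_raising_threshold` and the data are hypothetical.

```python
def update_after_raising_threshold(old_mfis, transactions, new_min_count):
    """Recompute maximum frequent itemsets after the threshold was raised.

    Assumption: `old_mfis` are the maximum frequent itemsets of the same
    `transactions` for a threshold no larger than `new_min_count`.
    Candidates are processed from longest to shortest, so any frequent
    superset of a candidate is already in `result` when it is examined.
    """
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    result = []
    candidates = {frozenset(m) for m in old_mfis}
    while candidates:
        size = max(len(c) for c in candidates)
        level = {c for c in candidates if len(c) == size}
        candidates -= level
        for cand in level:
            if any(cand <= m for m in result):
                continue  # covered by a longer result, cannot be maximal
            if support(cand) >= new_min_count:
                result.append(cand)
            else:
                # Still inside an old result, so only shrinking can help.
                for item in cand:
                    sub = cand - {item}
                    if sub:
                        candidates.add(sub)
    return result


# Hypothetical toy data: old results were mined with a support count of 2.
toy = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c", "d"}, {"d"}]
old = [{"a", "b", "c"}, {"d"}]
new = update_after_raising_threshold(old, toy, 3)
print(sorted(sorted(m) for m in new))  # [['a', 'b'], ['c']]
```

Here `{a, b, c}` drops below the new threshold and is split into `{a, b}` and `{c}`, while `{d}` disappears entirely. Changes such as added transactions or a lowered threshold need different handling and are not covered by this sketch.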
Keywords:data mining  frequent pattern tree  maximum frequent itemset  updating
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.