
Fast Mining of Global Maximum Frequent Itemsets
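Cite this article:LU Jie-Ping, YANG Ming, SUN Zhi-Hui, JU Shi-Guang. Fast Mining of Global Maximum Frequent Itemsets[J]. Journal of Software, 2005, 16(4):553-560.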
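Authors:LU Jie-Ping  YANG Ming  SUN Zhi-Hui  JU Shi-Guang
Affiliations:Department of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu 210096; School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013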
Funding:Supported by the National Natural Science Foundation of China under Grant No.70371015; the Natural Science Foundation of Jiangsu Province under Grant No.BK2004058
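Abstract:Mining maximum frequent itemsets is a key problem in many data mining applications. Most available algorithms for mining maximum frequent itemsets assume a single-machine environment, and work on mining global maximum frequent itemsets in a distributed environment remains scarce. Applying a single-machine algorithm in a distributed environment, or using a distributed algorithm for global frequent itemsets to mine the global maximum frequent itemsets, generates a large number of candidate frequent itemsets and incurs a high network communication cost. To address this, the paper proposes FMGMFI (fast mining global maximum frequent itemsets). The algorithm uses the FP-tree storage structure, so the frequency of an itemset can be obtained conveniently from the relevant paths of each local FP-tree, and it adopts a bidirectional search strategy that combines top-down and bottom-up search, which effectively reduces the network communication cost. Experimental results show that FMGMFI is effective and feasible.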

Keywords:distributed database  data mining  frequent pattern tree  global maximum frequent itemset
Article ID:1000/9825/2005/16(04)0553
Received:6/3/2004
Revised:7/2/2004

Fast Mining of Global Maximum Frequent Itemsets
LU Jie-Ping, YANG Ming, SUN Zhi-Hui and JU Shi-Guang. Fast Mining of Global Maximum Frequent Itemsets[J]. Journal of Software, 2005, 16(4):553-560.
Authors:LU Jie-Ping  YANG Ming  SUN Zhi-Hui and JU Shi-Guang
Abstract:Mining maximum frequent itemsets is a key problem in data mining with numerous important applications. Most existing algorithms for mining maximum frequent itemsets assume a single local database, and very little work has addressed distributed databases. Applying these single-database algorithms in a distributed setting, or using algorithms designed for global frequent itemsets, generates a large number of candidate itemsets and incurs a large communication overhead. This paper therefore proposes FMGMFI, an algorithm for fast mining of global maximum frequent itemsets. FMGMFI uses frequent pattern trees (FP-trees), so the global frequency of any itemset can be obtained conveniently from the corresponding paths of each local FP-tree, and it combines bottom-up and top-down search to require far less communication overhead. Experimental results show that FMGMFI is effective and efficient.
Keywords:distributed database  data mining  frequent pattern tree  global maximum frequent itemset
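
To make the terms in the abstract concrete, the following is a minimal sketch of what a "global maximum frequent itemset" is. It is not the FMGMFI algorithm: it uses plain transaction counting and a level-wise brute-force search instead of FP-trees and bidirectional search, and the function names, the example data, and the absolute support threshold are all assumptions made for illustration. The global frequency of an itemset is taken as the sum of its counts at each site, and an itemset is maximal if it is globally frequent and has no globally frequent proper superset.

```python
from itertools import combinations


def local_count(transactions, itemset):
    """Count the transactions at one site that contain every item of itemset."""
    target = set(itemset)
    return sum(1 for t in transactions if target <= set(t))


def global_count(sites, itemset):
    """Global frequency: the sum of the local counts over all sites."""
    return sum(local_count(site, itemset) for site in sites)


def global_maximal_frequent_itemsets(sites, min_support):
    """Brute-force illustration (hypothetical helper, not FMGMFI).

    min_support is an absolute global count, an assumption of this sketch.
    """
    items = sorted({i for site in sites for t in site for i in t})
    frequent = []
    for k in range(1, len(items) + 1):
        level = [frozenset(c) for c in combinations(items, k)
                 if global_count(sites, c) >= min_support]
        if not level:
            # No frequent k-itemset means no larger one can be frequent,
            # because every subset of a frequent itemset is frequent.
            break
        frequent.extend(level)
    # Keep only itemsets with no frequent proper superset.
    return [f for f in frequent if not any(f < g for g in frequent)]


# Hypothetical data: two sites, each holding its own transactions.
site_a = [["a", "b", "c"], ["a", "b"], ["a", "c"]]
site_b = [["a", "b", "c"], ["b", "c"], ["d"]]
result = global_maximal_frequent_itemsets([site_a, site_b], min_support=3)
```

The brute-force version enumerates every candidate at each level, which is exactly the candidate explosion the abstract describes; the paper's contribution is to avoid it, and the details of how it does so are in the full text.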