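Fast Mining of Global Frequent Itemsets
Cite this article: YANG Ming, SUN Zhi Hui, JI Gen Lin. Fast Mining of Global Frequent Itemsets[J]. Journal of Computer Research and Development, 2003, 40(4): 620-626.
Authors: YANG Ming, SUN Zhi Hui, JI Gen Lin
Affiliation: Department of Computer Science and Engineering, Southeast University, Nanjing 210096
Funding: National Natural Science Foundation of China (79970092)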
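Abstract: In distributed environments, mining global frequent itemsets is one of the most important research topics in data mining. Traditional algorithms for mining global frequent itemsets adopt the Apriori framework: they must scan the database multiple times and generate a large number of candidate itemsets, and deriving global frequent itemsets by transmitting local frequent itemsets incurs a high network communication cost. To address this, a fast algorithm for mining global frequent itemsets in distributed databases, FMAGF, is proposed. FMAGF mines global frequent itemsets by transmitting conditional frequent pattern trees or conditional pattern bases, which effectively reduces network traffic and improves the efficiency of mining global frequent itemsets. Theoretical analysis and experimental results show that the proposed algorithm is effective and feasible.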

Keywords: data mining; global frequent itemsets; frequent pattern tree; fast mining algorithm; Boolean association rules; database; Apriori algorithm

Fast Mining of Global Frequent Itemsets
YANG Ming, SUN Zhi Hui, and JI Gen Lin. Fast Mining of Global Frequent Itemsets[J]. Journal of Computer Research and Development, 2003, 40(4): 620-626.
Authors: YANG Ming, SUN Zhi Hui, and JI Gen Lin
Abstract: Fast mining of global frequent itemsets is an important data mining problem in a distributed database environment. Conventional algorithms for mining global frequent itemsets employ the same framework as Apriori. However, candidate set generation is costly, and these algorithms need to scan the database repeatedly, especially when there are prolific patterns and/or long patterns. In addition, transmitting local frequent itemsets in order to obtain global frequent itemsets incurs high communication overhead. In this paper, an algorithm for distributed databases, FMAGF (fast mining algorithm of global frequent itemsets), is proposed. The idea of FMAGF is to transmit only conditional frequent pattern trees or conditional pattern bases, rather than a large number of local frequent itemsets; the algorithm therefore incurs far less communication overhead and improves the efficiency of mining global frequent itemsets. Theoretical analysis and experimental results show the feasibility and effectiveness of the algorithm.
Keywords: data mining; distributed database; global frequent itemsets; frequent pattern tree (FP-tree)
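
The abstract's central idea, exchanging conditional pattern bases instead of local frequent itemsets, can be illustrated with a small sketch. This is not the FMAGF algorithm from the paper, whose details are not given on this page. It is a simplified single-process simulation under several assumptions: the function names (`global_item_ranks`, `local_pattern_bases`, `merge_bases`, `mine_global`) are hypothetical, each site's conditional pattern bases are computed directly from its transactions rather than from an FP-tree, and the merged bases are mined by brute-force subset enumeration rather than by recursive conditional FP-tree construction.

```python
from collections import Counter
from itertools import combinations


def global_item_ranks(sites, min_count):
    """Sum per-site item counts and rank the globally frequent items.

    Items are ordered by descending global count (ties broken by name),
    as in the usual FP-tree item ordering.
    """
    total = Counter()
    for transactions in sites:
        for t in transactions:
            total.update(set(t))
    frequent = [i for i, c in total.items() if c >= min_count]
    frequent.sort(key=lambda i: (-total[i], i))
    return {item: rank for rank, item in enumerate(frequent)}


def local_pattern_bases(transactions, rank):
    """At one site: for each item, count its prefix paths.

    A prefix path is the tuple of globally frequent items that precede
    the item in the global order within the same transaction.
    """
    bases = {}
    for t in transactions:
        path = sorted((i for i in set(t) if i in rank), key=rank.get)
        for pos, item in enumerate(path):
            bases.setdefault(item, Counter())[tuple(path[:pos])] += 1
    return bases


def merge_bases(all_site_bases):
    """Coordinator side: add up the pattern bases received from all sites."""
    merged = {}
    for bases in all_site_bases:
        for item, base in bases.items():
            merged.setdefault(item, Counter()).update(base)
    return merged


def mine_global(sites, min_count):
    """Return {frozenset(itemset): global support} for frequent itemsets.

    Brute force: every subset of every prefix path is counted, which is
    exponential in path length and only suitable for tiny examples.
    """
    rank = global_item_ranks(sites, min_count)
    merged = merge_bases(local_pattern_bases(s, rank) for s in sites)
    result = {}
    for item, base in merged.items():
        support = Counter()
        for prefix, count in base.items():
            for k in range(len(prefix) + 1):
                for subset in combinations(prefix, k):
                    support[subset + (item,)] += count
        for itemset, count in support.items():
            if count >= min_count:
                result[frozenset(itemset)] = count
    return result
```

For example, calling `mine_global([[["a", "b", "c"], ["a", "b"], ["a", "c"]], [["a", "b", "c"], ["b", "c"], ["a"]]], 3)` simulates two sites with three transactions each. Only item counts and per-item prefix-path counts cross the site boundary in this sketch; no site has to enumerate or send its own local frequent itemsets.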
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.