首页 | 本学科首页   官方微博 | 高级检索  
     

一种基于FP-Growth的频繁项目集并行挖掘算法
引用本文:章志刚,吉根林.一种基于FP-Growth的频繁项目集并行挖掘算法[J].计算机工程与应用,2014,50(2):103-106.
作者姓名:章志刚  吉根林
作者单位:南京师范大学 计算机科学与技术学院,南京 210023
基金项目:江苏省自然科学基金重点项目(No.BK2011005).
摘    要:FP-Growth算法是基于FP树挖掘频繁项目集的经典算法,为提高FP-Growth算法挖掘大规模数据频繁项目集的效率,提出了一种基于FP-Growth的频繁项目集并行挖掘算法FPPM。该算法基于Map/Reduce并行模型,在每个计算节点上首先构造局部频繁模式树,并对之进行挖掘得到局部频繁项目集,然后合并局部频繁项目集以得到全局频繁项集,由于此时得到的结果并不完备,所以对合并后未达到最小支持度阈值的项目集,重新计算其支持数。介绍了FPPM算法的设计思想,测试了其性能。实验结果表明FPPM算法具有较好的可扩展性。

关 键 词:频繁项目集  并行挖掘  FP-Growth  Map/Reduce  

Parallel algorithm for mining frequent item sets based on FP-Growth
ZHANG Zhigang,JI Genlin.Parallel algorithm for mining frequent item sets based on FP-Growth[J].Computer Engineering and Applications,2014,50(2):103-106.
Authors:ZHANG Zhigang  JI Genlin
Affiliation:School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023, China
Abstract:Algorithm FP-Growth is a classic algorithm for mining frequent item sets which is based on frequent pattern tree. In order to improve the efficiency of algorithm FP-Growth for mining association rules from massive datasets, parallel FP-Growth algorithm FPPM is presented. The algorithm is based on Map/Reduce model, and the local frequent pattern tree of each computing node is built, these local trees are mined to get local frequent item sets, and local frequent item sets are merged into global frequent item sets. After the statistics of the local frequent item sets, a complete result is got. In this paper, the idea of FPPM is introduced and its performance is studied. The experimental results show that the parallel algo- rithm FPPM has high scalability.
Keywords:frequent item set  parallel mining  FP-Growth  Map/Reduce
本文献已被 CNKI 维普 等数据库收录!
点击此处可从《计算机工程与应用》浏览原始摘要信息
点击此处可从《计算机工程与应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号