
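An Efficient Frequent Pattern Mining Algorithm Based on the Sorted FP-Tree
Cite this article:Qin Liangxi, Li Qian, Shi Zhongzhi. An efficient frequent pattern mining algorithm based on the sorted FP-tree[J]. Computer Science, 2005, 32(4): 31-33.
Authors:Qin Liangxi  Li Qian  Shi Zhongzhi
Affiliations:1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080; Graduate School of the Chinese Academy of Sciences, Beijing 100039; College of Computer and Information Engineering, Guangxi University, Nanning 530004
2. Graduate School of the Chinese Academy of Sciences, Beijing 100039
3. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080
Funding:National Natural Science Foundation of China (90104021, 60173017)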
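Abstract:FP-growth is currently one of the more efficient frequent pattern mining algorithms. In FP-growth, constructing and traversing the FP-tree and the conditional FP-trees accounts for most of the running time, so reducing this cost should further improve the algorithm's efficiency. This paper presents SFP-growth, a frequent pattern mining algorithm that speeds up FP-tree construction by keeping the FP-tree sorted and by using an efficient sorting algorithm, among other measures. Experimental results show that SFP-growth is an efficient frequent pattern mining algorithm and that it outperforms Apriori, Eclat, and FP-growth.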

Keywords:Data mining  Association rules  Frequent patterns  Sorted FP-tree

An Efficient Frequent Patterns Mining Algorithm Based on Sorted FP-Tree
QIN Liang-Xi, LI Qian, SHI Zhong-Zhi. An Efficient Frequent Patterns Mining Algorithm Based on Sorted FP-Tree[J]. Computer Science, 2005, 32(4): 31-33.
Authors:QIN Liang-Xi  LI Qian  SHI Zhong-Zhi
Affiliation:Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080; Graduate School of Chinese Academy of Sciences, Beijing 100039; College of Computer and Information Engineering, Guangxi University, Nanning 530004
Abstract:FP-growth is a high-performance algorithm for mining frequent patterns. Most of its running time is spent constructing and traversing the FP-tree and the conditional FP-trees, so reducing the time consumed by tree construction and traversal can improve performance. In this paper, an improved algorithm, SFP-growth, is presented. The algorithm adopts sorted FP-trees to store the main information of the transactions. It also uses an efficient sorting algorithm and other techniques in the construction of the trees. The experimental results show that SFP-growth is an efficient algorithm that outperforms Apriori, Eclat, and FP-growth.
Keywords:Data mining  Association rules  Frequent patterns  Sorted FP-tree  
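
The tree construction step that the abstract identifies as the bottleneck can be sketched roughly as follows. This is a minimal Python illustration of ordinary two-pass FP-tree construction, not SFP-growth itself: this record does not describe how the paper orders the tree or which sorting algorithm it uses. The names `FPNode` and `build_fp_tree`, the tie-breaking rule, and the sample transactions are all assumptions made for the example. The per-transaction sort in the second pass is the kind of step a faster sorting routine would target.

```python
from collections import Counter


class FPNode:
    """One node of a prefix tree over frequency-ordered items (hypothetical helper)."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode


def build_fp_tree(transactions, min_support):
    """Build a plain FP-tree; returns (root, rank) where rank maps item -> position."""
    # Pass 1: count how many transactions contain each item.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    # Global item order: descending support, ties broken by item name (an assumption).
    ordered = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: pos for pos, item in enumerate(ordered)}

    # Pass 2: drop infrequent items, sort each transaction by rank, insert as a path.
    root = FPNode(None)
    for t in transactions:
        items = sorted((i for i in set(t) if i in rank), key=rank.__getitem__)
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, rank


# Hypothetical sample data, for illustration only.
sample = [["a", "b"], ["b", "c", "d"], ["a", "b", "d"], ["a", "c"], ["b", "d"]]
tree, order = build_fp_tree(sample, min_support=2)
```

Because every transaction is inserted in the same global order, transactions that share frequent items share a prefix path, which is what keeps the tree compact.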
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.