首页 | 本学科首页   官方微博 | 高级检索  
     

一种基于压缩前缀树的频繁模式挖掘算法
引用本文:郭云峰,张集祥. 一种基于压缩前缀树的频繁模式挖掘算法[J]. 计算机工程与科学, 2009, 31(12). DOI: 10.3969/j.issn.1007-130X.2009.12.021
作者姓名:郭云峰  张集祥
作者单位:杭州电子科技大学图形图像研究所,浙江,杭州,310018;杭州电子科技大学图形图像研究所,浙江,杭州,310018
摘    要:针对FP-growth算法存在动态维护复杂、在挖掘过程中需要递归地创建大量的条件频繁模式树,导致时空效率不高等不足,本算法在压缩前缀树的基础上,通过调整树中节点信息和节点链,采用深度优先的策略挖掘频繁模式,无需任何附加的数据结构,极大地减少了系统资源的消耗,减少树的规模和遍历次数,挖掘效率大大提高。

关 键 词:频繁模式  压缩前缀树  频繁项集

A Mining Algorithm for Frequent Patterns Based on Compressed Prefix-Tree
Abstract:The FP-growth algorithm achieves better performance and efficiency than the Apriori-like algorithms because of avoiding costly candidate generation, but it still suffers from creating conditional FP-trees separately and recursively during the mining process, so its efficiency in time and space is not idear. In this paper, we propose a new algorithm CPM that designs a new tree structure called compressed prefix-tree, which stores all of the information in a highly compact form, decreases the consumption of system resources greatly. CPM mines frequent patterns in a depth-first order and directly in compressed prefix-trees by adjusting the node information and node links without using any additional data structures. Thus, it improves performance greatly.
Keywords:frequent pattern  compressed prefix-tree  frequent itermset
本文献已被 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号