一种基于压缩前缀树的频繁模式挖掘算法 |
| |
引用本文: | 郭云峰,张集祥. 一种基于压缩前缀树的频繁模式挖掘算法[J]. 计算机工程与科学, 2009, 31(12). DOI: 10.3969/j.issn.1007-130X.2009.12.021 |
| |
作者姓名: | 郭云峰 张集祥 |
| |
作者单位: | 杭州电子科技大学图形图像研究所,浙江,杭州,310018;杭州电子科技大学图形图像研究所,浙江,杭州,310018 |
| |
摘 要: | 针对FP-growth算法存在动态维护复杂、在挖掘过程中需要递归地创建大量的条件频繁模式树,导致时空效率不高等不足,本算法在压缩前缀树的基础上,通过调整树中节点信息和节点链,采用深度优先的策略挖掘频繁模式,无需任何附加的数据结构,极大地减少了系统资源的消耗,减少树的规模和遍历次数,挖掘效率大大提高。
|
关 键 词: | 频繁模式 压缩前缀树 频繁项集 |
A Mining Algorithm for Frequent Patterns Based on Compressed Prefix-Tree |
| |
Abstract: | The FP-growth algorithm achieves better performance and efficiency than the Apriori-like algorithms because of avoiding costly candidate generation, but it still suffers from creating conditional FP-trees separately and recursively during the mining process, so its efficiency in time and space is not idear. In this paper, we propose a new algorithm CPM that designs a new tree structure called compressed prefix-tree, which stores all of the information in a highly compact form, decreases the consumption of system resources greatly. CPM mines frequent patterns in a depth-first order and directly in compressed prefix-trees by adjusting the node information and node links without using any additional data structures. Thus, it improves performance greatly. |
| |
Keywords: | frequent pattern compressed prefix-tree frequent itermset |
本文献已被 万方数据 等数据库收录! |
|