
Top-k frequent patterns based on nodesets
Original title (Chinese): 基于节点集Top-k频繁模式挖掘算法
Citation: SUN Jun, ZHANG Xihuang. Top-k frequent patterns based on nodesets[J]. Computer Engineering and Applications (计算机工程与应用), 2017, 53(6): 101-105. DOI: 10.3778/j.issn.1002-8331.1508-0158
Authors: SUN Jun, ZHANG Xihuang
Affiliation: School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
Abstract: Frequent pattern mining usually produces far too many patterns, of which only a small number are used in real applications. Mining top-rank-k frequent patterns limits the number of mined patterns by ranking them by frequency, which improves the efficiency of the algorithm. This paper proposes TPN (Top-k-Patterns based on Nodesets), an algorithm for mining top-k frequent patterns. TPN represents patterns with a new data structure, Nodesets, compresses the data into a Poc-tree, and uses a top-k-rank table to recompute the minimum support, which limits the number of candidate patterns generated. Experiments compare the mining time of TPN with that of ATFP and Top-k-FP-growth on two datasets. The results show that TPN is more efficient.
Keywords: data mining; top-k; frequent patterns; nodesets
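
The abstract's idea of raising the minimum support from a top-k-rank table can be illustrated with a small sketch. This is not the authors' TPN implementation: it uses neither a Poc-tree nor Nodesets, and instead swaps in a plain level-wise search over transaction-id sets. The names `TopKRankTable` and `mine_top_rank_k` are hypothetical, and the table semantics (one row per distinct support value, at most k rows) are an assumption based on the usual meaning of "top-rank-k".

```python
from itertools import combinations


class TopKRankTable:
    """Hypothetical table keeping patterns for the k highest distinct supports."""

    def __init__(self, k):
        self.k = k
        self.rows = {}  # support value -> list of patterns with that support

    @property
    def min_support(self):
        # Until k distinct supports are stored, any occurring pattern may qualify.
        if len(self.rows) < self.k:
            return 1
        return min(self.rows)

    def offer(self, pattern, support):
        """Try to insert a pattern; return True if it passed the threshold."""
        if support < self.min_support:
            return False
        self.rows.setdefault(support, []).append(pattern)
        if len(self.rows) > self.k:
            # Dropping the lowest row raises the minimum support.
            del self.rows[min(self.rows)]
        return True


def mine_top_rank_k(transactions, k):
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    table = TopKRankTable(k)
    level = {}
    for item, tids in tidsets.items():
        if table.offer((item,), len(tids)):
            level[(item,)] = tids

    while level:
        # Discard patterns that fell below the threshold after it was raised.
        level = {p: t for p, t in level.items() if len(t) >= table.min_support}
        next_level = {}
        for p, q in combinations(sorted(level), 2):
            if p[:-1] != q[:-1]:
                continue  # join only patterns that share a common prefix
            tids = level[p] & level[q]
            candidate = p + (q[-1],)
            if table.offer(candidate, len(tids)):
                next_level[candidate] = tids
        level = next_level

    # Rows ordered from the highest support (rank 1) downwards.
    return sorted(table.rows.items(), reverse=True)
```

Because the support of a pattern never exceeds that of its subsets, and the threshold only increases, pruning a candidate below the current minimum support cannot discard a pattern that belongs in the final table. That is the property that lets a top-k miner generate fewer candidates as the table fills.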