首页 | 本学科首页   官方微博 | 高级检索  
     

基于B-list的快速频繁模式挖掘算法
引用本文:李校林,杜托,刘彪. 基于B-list的快速频繁模式挖掘算法[J]. 计算机应用, 2017, 37(8): 2357-2361. DOI: 10.11772/j.issn.1001-9081.2017.08.2357
作者姓名:李校林  杜托  刘彪
作者单位:1. 重庆邮电大学 通信新技术应用研究中心, 重庆 400065;2. 重庆信科设计有限公司, 重庆 400065
基金项目:重庆市研究生科研创新基金资助项目(CYS15166)。
摘    要:针对现有的频繁模式挖掘算法存在建树复杂、挖掘效率低等问题,提出一种基于构造链表(B-list)的频繁模式挖掘(BLFPM)算法。BLFPM使用一种新的数据结构B-list表示频繁项集,通过连接两个k-1-频繁项集的B-list可以快速得到k-项集的支持度,避免了多次扫描数据库;针对连接两个B-list时间复杂度高的问题,给出了一种线性时间复杂度的连接方法,提高了BLFPM的时间效率;同时,BLFPM采用集合枚举树代表搜索空间,并使用子集非频繁剪枝策略,减小了频繁模式挖掘的搜索空间,提高了算法的执行速度。实验结果表明,与NSFI算法和prepost算法相比,BLFPM的时间效率提高约12%到29%,空间效率提高约10%到24%,对稀疏数据库或稠密数据库进行频繁模式挖掘均可以得到良好的效果。

关 键 词:数据挖掘  模式挖掘  频繁项集  遍历构造树  构造链表  
收稿时间:2016-12-13
修稿时间:2017-03-03

Fast algorithm for mining frequent patterns based on B-list
LI Xiaolin,DU Tuo,LIU Biao. Fast algorithm for mining frequent patterns based on B-list[J]. Journal of Computer Applications, 2017, 37(8): 2357-2361. DOI: 10.11772/j.issn.1001-9081.2017.08.2357
Authors:LI Xiaolin  DU Tuo  LIU Biao
Affiliation:1. Institute of Applied Communication Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China;2. Chongqing Information Technology Designing Company Limited, Chongqing 400065, China
Abstract:In order to solve the problems in the existing frequent pattern mining algorithms, such as complex tree building and low mining efficiency, a high-performance algorithm for mining frequent patterns was proposed, namely B-List Frequent Pattern Mining (BLFPM). A new data structure of Building list (B-list) was employed to represent frequent itemsets, and the frequent patterns were directly discovered by intersecting two B-lists without scanning the database. Aiming at the high time complexity of connecting two B-lists, a linear time complexity connection method was proposed to improve the time efficiency of BLFPM. Besides, set-enumeration search tree and an efficient pruning strategy were adopted to greatly reduce the search space and speed up the execution. Compared to N-list and Subsume-based algorithm for mining Frequent Itemsets (NSFI) and prepost algorithm, the time efficiency of BLFPM was improved by about 12% to 29%, and the space efficiency of BLFPM was improved by about 10% to 24%. The experimental results show that BLFPM has good performance for both sparse database and intensive database.
Keywords:data mining   pattern mining   frequent itemset   Traversal when Building tree (TB-tree)   Building list (B-list)
点击此处可从《计算机应用》浏览原始摘要信息
点击此处可从《计算机应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号