首页 | 本学科首页   官方微博 | 高级检索  
     

一种直接在Trans-树中挖掘频繁模式的新算法
引用本文:范明,王秉政.一种直接在Trans-树中挖掘频繁模式的新算法[J].计算机科学,2003,30(8):117-120.
作者姓名:范明  王秉政
作者单位:郑州大学计算机科学系,郑州,450052
基金项目:河南省自然科学基金(项目号:0111060700,0211050100)
摘    要:Frequent pattern mining plays an essential role in many important data mining tasks. FP-growth is a very efficient algorithm for frequent pattern mining. However, it still suffers from creating conditional FP-tree separately and recursively during the mining process. In this paper, we propose a new algorithm, called Least-Item-First Pat-tern Growth (LIFPG), for mining frequent patterns. LIFPG mines frequent patterns directly in Trans-tree withoutusing any additional data structures. The key idea is that least items are always considered first when the current pat-tern growth. By this way, conditional sub-tree can be created directly in Trans-tree by adjusting node-links and re-counting counts of some nodes. Experiments show that, in comparison with FP-Growth, our algorithm is about fourtimes faster and saves half of memory;it also has good time and space scalability with the number of transactions,and has an excellent performance in dense dataset mining as well.

关 键 词:频繁模式  关联规则  数据库  Trans-树  数据挖掘  算法

A Novel Algorithm for Mining Frequent Patterns Directly in Trans-Tree
FAN Ming WANG Bing-Zheng.A Novel Algorithm for Mining Frequent Patterns Directly in Trans-Tree[J].Computer Science,2003,30(8):117-120.
Authors:FAN Ming WANG Bing-Zheng
Abstract:Frequent pattern mining plays an essential role in many important data mining tasks. FP-growth is a very efficient algorithm for frequent pattern mining. However, it still suffers from creating conditional FP-tree separately and recursively during the mining process. In this paper, we propose a new algorithm, called Least-Item-First Pattern Growth (LIFPG), for mining frequent patterns. LIFPG mines frequent patterns directly in Trans-tree without using any additional data structures. The key idea is that least items are always considered first when the current pattern growth. By this way, conditional sub-tree can be created directly in Trans-tree by adjusting node-links and recounting counts of some nodes. Experiments show that, in comparison with FP-Growth, our algorithm is about four times faster and saves half of memory; it also has good time and space scalability with the number of transactions, and has an excellent performance in dense dataset mining as well.
Keywords:Data mining  Frequent pattern  Association rule  
本文献已被 CNKI 维普 万方数据 等数据库收录!
点击此处可从《计算机科学》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号