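Mining frequent patterns from a limited number of conditional FP_trees
Cite this article: LIN Li (林丽), FENG Shao-rong (冯少荣), XUE Yong-sheng (薛永生). Mining frequent patterns from a limited number of conditional FP_trees [J]. Computer Engineering and Applications (计算机工程与应用), 2007, 43(5): 175-177.
Authors: LIN Li (林丽)  FENG Shao-rong (冯少荣)  XUE Yong-sheng (薛永生)
Affiliations: Database Laboratory, Department of Computer Science, Xiamen University; Department of Computer Science, Xiamen University; Department of Computer Science, Xiamen University
Funding: Natural Science Foundation of Fujian Province; Fujian Province High-Tech Project
Abstract: Discovering association rules is a basic problem in data mining, and the most expensive step in association rule discovery is finding frequent patterns. The FP_growth (frequent-pattern growth) method generates no candidate item sets when producing long and short frequent item sets, which greatly improves mining efficiency. However, FP_growth builds a large number of conditional FP_trees while mining frequent patterns and therefore occupies a large amount of space. This paper studies FP_growth and proposes an improved algorithm that retains all of its advantages while avoiding this drawback. The algorithm mines long and short frequent patterns by building only a limited number of conditional FP_trees (equal to the number of attributes in the transaction database), which greatly reduces the space that FP_growth requires. Experiments show that the proposed algorithm is effective.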

Keywords: association rules  FP_growth  frequent patterns  conditional FP_tree
Article ID: 1002-8331(2007)05-0175-03
Received: 2006-03-14
Revised: 2006-06

Mining frequent item sets from several conditional FP_trees
LIN Li,FENG Shao-rong,XUE Yong-sheng.Mining frequent item sets from several conditional FP_trees[J].Computer Engineering and Applications,2007,43(5):175-177.
Authors:LIN Li  FENG Shao-rong  XUE Yong-sheng
Affiliation:Department of Computer Science,Xiamen University,Xiamen,Fujian 361005,China
Abstract:Discovering association rules is a basic problem in data mining, and finding frequent item sets is the most expensive step in association rule discovery. The frequent pattern growth (FP_growth) method is efficient for mining both long and short frequent patterns because it generates no candidate sets, but it builds a huge number of conditional FP_trees and therefore occupies a large amount of memory. This paper proposes an improved algorithm that inherits all the advantages of the FP_growth method while avoiding this bottleneck. By building only a limited number of conditional FP_trees (equal to the number of items in the database) to mine long and short frequent item sets, the improved algorithm saves memory space significantly. A performance study also shows that the improved method is efficient.
Keywords:association rules  FP_growth  frequent item sets  conditional FP_tree
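
For readers unfamiliar with the structures named in the abstract, the following is a minimal Python sketch of a standard FP-tree and of the conditional pattern base from which a conditional FP_tree is built. It is not the authors' improved algorithm, whose details the abstract does not give; the names `Node`, `build_fp_tree`, and `conditional_pattern_base` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


class Node:
    """One FP-tree node: an item, its count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a standard FP-tree (illustrative sketch, not the paper's method)."""
    # First pass: count the support of each item and keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = Node(None, None)
    header = defaultdict(list)  # item -> all tree nodes carrying that item

    # Second pass: insert each transaction with items in descending support
    # order (ties broken by item name) so that common prefixes are shared.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header, frequent


def conditional_pattern_base(item, header):
    """Return the prefix paths that lead to `item`, each with its count."""
    base = []
    for node in header[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((tuple(reversed(path)), node.count))
    return base


# Toy transaction database (made up for this example).
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
root, header, frequent = build_fp_tree(transactions, min_support=2)
for item in sorted(frequent):
    print(item, conditional_pattern_base(item, header))
```

In classical FP_growth, a conditional FP_tree is built from each such pattern base and then mined recursively, so further conditional trees are created at every level of the recursion. That recursive growth is the space cost that the abstract says the improved algorithm avoids by limiting the number of conditional trees.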
This article is indexed in databases including CNKI, VIP (维普), and Wanfang Data (万方数据).