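Frequent Pattern Mining Algorithm Based on Hybrid Projection
Cite this article: Liu Junqiang, Pan Yunhe. Frequent Pattern Mining Algorithm Based on Hybrid Projection [J]. Journal of Computer Research and Development, 2003, 40(10): 1488-1498
Authors: Liu Junqiang, Pan Yunhe
Affiliations: 1. Department of Computer Science, Hangzhou University of Commerce, Hangzhou 310035; Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027
2. Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027
Funding: National "863" High Technology Research and Development Program of China (2002AA121064); Zhejiang Provincial Natural Science Foundation (602140); Science and Technology Program of the Zhejiang Provincial Department of Education (20020635)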
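Abstract: Frequent pattern mining is the most fundamental data mining problem, and because of its inherent complexity, improving the performance of mining algorithms has long been difficult. HP is a new algorithm that mines the complete set of frequent patterns through hybrid projection of the database. The idea behind hybrid projection is that no dataset can simply be assigned to a single class of characteristics, so the mining process should dynamically adjust its frequent pattern tree construction strategy, its representation of transaction subsets, and its projection method as the characteristics of the local data subsets change. HP proposes a tree-based pseudo (virtual) projection and an array-based unfiltered projection, which largely resolve the conflict between improving time efficiency and saving memory. Experiments show that HP is one to three orders of magnitude faster than Apriori, FP-Growth, and H-Mine, and that its space scalability is also far better than that of these algorithms.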

Keywords: knowledge discovery; data mining; frequent patterns

Mining Frequent Patterns Based on Hybrid Projection
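LIU Jun Qiang and PAN Yun He. Mining Frequent Patterns Based on Hybrid Projection[J]. Journal of Computer Research and Development, 2003, 40(10): 1488-1498
Authors: LIU Jun Qiang, PAN Yun He
Affiliation: LIU Jun Qiang (1, 2) and PAN Yun He (2); 1. Department of Computer Science, Hangzhou University of Commerce, Hangzhou 310035; 2. Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027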
Abstract: Frequent pattern mining is a fundamental data mining problem, and existing algorithms still suffer from inefficiency because of its inherent complexity. The new algorithm HP presented in this paper discovers frequent patterns by employing hybrid projections of datasets to grow a frequent pattern tree. The basic idea is that a dataset cannot simply be classified as dense or sparse, so the mining algorithm should dynamically adjust its frequent pattern tree search strategy, its representation of transaction subsets, and its projection method according to the features of the local subsets. HP also proposes a tree-based pseudo projection and an array-based unfiltered projection, which resolve the conflict between time complexity and space complexity. Comparative experiments show that HP is one to three orders of magnitude more efficient than Apriori, FP-Growth, and H-Mine, and is also more scalable than these algorithms.
Keywords:knowledge discovery  data mining  frequent patterns  
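
The idea in the abstract, growing patterns over projected transaction subsets and choosing the projection method from the local data, can be illustrated with a minimal Python sketch. This is not the authors' HP implementation: it uses plain tuples instead of HP's tree and array structures, and the function names, the `density` measure, and the `dense_threshold` parameter are all hypothetical choices made for illustration. It only shows that a filtered and an unfiltered projection yield the same patterns, so a miner is free to switch between them per subset.

```python
from collections import Counter


def mine(transactions, min_support, dense_threshold=0.5):
    """Return {frozenset(pattern): support} for every frequent itemset.

    Illustrative sketch only; `dense_threshold` is a hypothetical knob.
    """
    # Sort items inside each transaction so every pattern is grown exactly once.
    db = [tuple(sorted(set(t))) for t in transactions]
    results = {}
    _grow(db, (), min_support, dense_threshold, results)
    return results


def _grow(db, prefix, min_support, dense_threshold, results):
    # Count the items that occur in this (projected) transaction subset.
    counts = Counter(item for t in db for item in t)
    frequent = sorted(i for i, c in counts.items() if c >= min_support)
    if not frequent:
        return

    # Hypothetical local density: share of (transaction, frequent item)
    # cells that are filled in this subset.
    density = sum(counts[i] for i in frequent) / (len(db) * len(frequent))
    keep = set(frequent)

    for item in frequent:
        pattern = prefix + (item,)
        results[frozenset(pattern)] = counts[item]

        projected = []
        for t in db:
            if item not in t:
                continue
            suffix = t[t.index(item) + 1:]
            if density >= dense_threshold:
                # Filtered projection: drop locally infrequent items now,
                # which pays off when the subset is dense and reused often.
                suffix = tuple(x for x in suffix if x in keep)
            # Otherwise: unfiltered projection, the suffix is kept as-is
            # and infrequent items are ignored by the next counting pass.
            if suffix:
                projected.append(suffix)

        _grow(projected, pattern, min_support, dense_threshold, results)
```

Calling `mine(transactions, min_support)` with a list of item lists returns a dictionary that maps each frequent itemset to its support. Setting `dense_threshold` to 0.0 forces the filtered projection everywhere and a value above 1.0 forces the unfiltered one; both settings should return identical results, differing only in how much data is copied.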