首页 | 本学科首页   官方微博 | 高级检索  
     

基于优化上界的高平均效用项集垂直挖掘算法
引用本文:浦蓉,邵剑飞,胡常礼,曲坤.基于优化上界的高平均效用项集垂直挖掘算法[J].计算机工程与科学,2020,42(5):931-937.
作者姓名:浦蓉  邵剑飞  胡常礼  曲坤
作者单位:(昆明理工大学信息工程与自动化学院,云南 昆明 650500)
摘    要:高平均效用项集挖掘是当前研究的热点之一。针对高平均效用项集挖掘算法产生大量无意义的候选项集,而导致高内存消耗和运行时间长的问题,提出了dMHAUI算法。首先定义了集成矩阵Q,并提出了4种基于垂直数据库表示的紧凑平均效用上界及3种有效的修剪策略;将高平均效用项集挖掘所需的信息存储于IDUL结构树,利用改进的diffset技术快速计算项集的平均效用和上界;最后通过递归调用搜索函数得到高平均效用项集。与EHAUPM算法和MHAI算法进行仿真比较,结果表明,dMHAUI算法在运行时间、连接比较次数和可扩展性等方面都有较优的性能。

关 键 词:模式挖掘  高平均效用项集挖掘  dMHAUI算法  上界  效用挖掘
收稿时间:2019-10-21
修稿时间:2019-12-11

A vertical mining algorithm for high average-utility itemsets based on optimal upper bound
PU Rong,SHAO Jian-fei,HU Chang-li,QU Kun.A vertical mining algorithm for high average-utility itemsets based on optimal upper bound[J].Computer Engineering & Science,2020,42(5):931-937.
Authors:PU Rong  SHAO Jian-fei  HU Chang-li  QU Kun
Affiliation:(Faculty of Information Engineering and Automation,Kunming University of Science and Technology,Kunming 650500,China)
Abstract:Mining high average-utility itemsets is one of the hotspots in the current research. Aiming at the problem that the high average-utility itemsets mining algorithm generates a large number of meaningless candidate itemsets, which results in high memory consumption and long running time, the dMHAUI algorithm is proposed. Firstly, the algorithm defines the integration matrix Q, and proposes four compact average-utility upper bounds based on vertical database representation and three effective pru- ning strategies. Secondly, the information needed for high average-utility itemsets mining is stored in the IDUL structure tree, and the improved diffset technique is used to quickly calculate the average- utility and upper bound of itemsets. Finally, the high average-utility itemsets are obtained by recursively calling the search algorithm. Simulation results show that the dMHAUI function has better performance than the EHAUPM algorithm and the MHAI algorithm in terms of running time, join operation number and scalability.
Keywords:pattern mining  high average-utility itemsets mining  dMHAUI algorithm  upper bound  utility mining  
本文献已被 维普 万方数据 等数据库收录!
点击此处可从《计算机工程与科学》浏览原始摘要信息
点击此处可从《计算机工程与科学》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号