ETARM: an efficient top-k association rule mining algorithm
Authors: Linh T T Nguyen, Bay Vo, Loan T T Nguyen, Philippe Fournier-Viger, Ali Selamat
Affiliation: 1. Faculty of Information Technology, Dong An Polytechnic, Binh Duong, Vietnam; 2. Faculty of Information Technology, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam; 3. Division of Knowledge and System Engineering for ICT, Ton Duc Thang University, Ho Chi Minh City, Vietnam; 4. Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam; 5. School of Humanities and Social Sciences, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China; 6. Universiti Teknologi Malaysia, Johor, Malaysia
Abstract: Mining association rules plays an important role in data mining and knowledge discovery since it can reveal strong associations between items in databases. Nevertheless, an important problem with traditional association rule mining methods is that they can generate a huge number of association rules, depending on how the parameters are set. Users, however, are often interested only in the strongest rules, and do not want to go through a large number of rules or wait for them to be generated. To address these needs, algorithms have been proposed to mine the top-k association rules in databases, where users directly set a parameter k to obtain the k most frequent rules. A major issue with these techniques is that they remain very costly in terms of execution time and memory. To address this issue, this paper presents a novel algorithm named ETARM (Efficient Top-k Association Rule Miner) to efficiently find the complete set of top-k association rules. The proposed algorithm integrates two novel candidate pruning properties to reduce the search space more effectively. These properties are applied during the candidate selection process to identify, based on a rule's confidence, items that should not be used to expand that rule, which reduces the number of candidates. An extensive experimental evaluation on six standard benchmark datasets shows that the proposed approach outperforms the state-of-the-art TopKRules algorithm in terms of both runtime and memory usage.
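To make the problem statement concrete, the following is a minimal brute-force sketch of top-k association rule mining. It is not ETARM or TopKRules and applies none of the pruning properties described in the abstract; it only illustrates what "the k most frequent rules" means. The function name `top_k_rules`, the `min_conf` threshold parameter, the use of absolute transaction counts as support, and the tie-breaking order are all assumptions made for illustration.

```python
from itertools import combinations


def top_k_rules(transactions, k, min_conf):
    """Naive top-k rule miner (illustrative baseline, exponential in the item count)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))

    def support(itemset):
        # Number of transactions that contain every item of `itemset`.
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    # Enumerate every itemset of size >= 2, then every split into X -> Y.
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            full = frozenset(itemset)
            sup = support(full)
            if sup == 0:
                continue
            for r in range(1, size):
                for antecedent in combinations(itemset, r):
                    x = frozenset(antecedent)
                    # confidence(X -> Y) = support(X u Y) / support(X)
                    conf = sup / support(x)
                    if conf >= min_conf:
                        y = tuple(sorted(full - x))
                        rules.append((sup, conf, tuple(sorted(x)), y))

    # Highest support first; ties broken by confidence, then lexicographically
    # (the tie-breaking rule is an assumption of this sketch).
    rules.sort(key=lambda rule: (-rule[0], -rule[1], rule[2], rule[3]))
    return rules[:k]


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for sup, conf, x, y in top_k_rules(db, k=2, min_conf=0.7):
        print(f"{x} -> {y}  support={sup}  confidence={conf:.2f}")
```

Because this sketch enumerates every itemset and every split, its cost grows exponentially with the number of items, which is the kind of expense that candidate pruning is meant to reduce.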
This article is indexed in SpringerLink and other databases.