首页 | 本学科首页   官方微博 | 高级检索  
     


High-utility and diverse itemset mining
Authors:Verma  Amit  Dawar  Siddharth  Kumar  Raman  Navathe  Shamkant  Goyal  Vikram
Affiliation:1.Department of Computer Science and Engineering, IKGPTU, Kapurthala, Punjab, India
;2.Department of Computer Science, IIIT-Delhi, New Delhi, India
;3.College Of Computing, Georgia Institute of Technology, Atlanta, Georgia, USA
;
Abstract:

High-utility Itemset Mining (HUIM) finds patterns from a transaction database with their utility no less than a user-defined threshold. The utility of an itemset is defined as the sum of the utilities of its items. The utility notion enables a data analyst to associate a profit score with each item and thereof to a pattern. We extend the notion of high-utility with diversity to define a new pattern type called High-utility and Diverse pattern (HUD). The notion of diversity of a pattern captures the extent of the different categories covered by the selected items in the pattern. An application of diverse-pattern lies in the recommendation task where a system can recommend to a customer a set of items from a new class based on her previously bought items. Our notion of diversity is easy to compute and also captures the basic essence of a previously proposed diversity notion. The existing algorithm to compute frequent-diverse patterns is 2-phase, i.e., in the first phase, frequent patterns are computed, out of which diverse patterns are filtered out in the second phase. We, in this paper, give an integrated algorithm that efficiently computes high-utility and diverse patterns in a single phase. Our experimental study shows that our proposed algorithm is very efficient as compared to a 2-phase algorithm that extracts high-utility itemsets in the first phase and filters out the diverse itemsets in the second phase.

Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号