Mining itemset utilities from transaction databases
Affiliation: 1. School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus, Shenzhen University Town, Xili, Shenzhen, China; 2. School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus, Shenzhen University Town, Xili, Shenzhen, China; 3. Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung, Taiwan; 4. Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan; 5. Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan; 1. School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China; 2. School of Natural Sciences and Humanities, Harbin Institute of Technology Shenzhen Graduate School, HIT Campus, Shenzhen University Town, Xili, Shenzhen 518055, China
Abstract: The rationale behind mining frequent itemsets is that only itemsets with high frequency are of interest to users. However, the practical usefulness of frequent itemsets is limited by the significance of the discovered itemsets. A frequent itemset reflects only the statistical correlation between items, not their semantic significance. In this paper, we propose a utility-based itemset mining approach to overcome this limitation. The proposed approach permits users to quantify their preferences concerning the usefulness of itemsets using utility values. The usefulness of an itemset is characterized as a utility constraint; that is, an itemset is interesting to the user only if it satisfies a given utility constraint. We show that the pruning strategies used in previous itemset mining approaches cannot be applied to utility constraints. In response, we identify several mathematical properties of utility constraints and design two novel pruning strategies. Two algorithms for utility-based itemset mining are developed by incorporating these pruning strategies. The algorithms are evaluated on synthetic and real-world databases. Experimental results show that the proposed algorithms are effective on the databases tested.
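To make the idea of a utility constraint concrete, the following is a minimal Python sketch, not the paper's algorithms. It assumes one common formulation: the utility of an item in a transaction is its quantity times a user-assigned per-unit value, and an itemset satisfies the constraint when its total utility reaches a threshold. The toy data, the names `itemset_utility` and `high_utility_itemsets`, and the threshold are all hypothetical. The search is brute force, with none of the pruning strategies the abstract mentions.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"A": 1, "B": 2},
    {"A": 2, "C": 1},
    {"A": 1, "B": 1, "C": 3},
    {"B": 4},
]

# Hypothetical per-unit utility values (for example, profit) set by the user.
unit_utility = {"A": 1, "B": 5, "C": 2}


def itemset_utility(itemset, transactions, unit_utility):
    """Sum quantity * unit utility over every transaction containing the whole itemset."""
    items = set(itemset)
    total = 0
    for t in transactions:
        if items.issubset(t):  # the transaction must contain every item
            total += sum(t[i] * unit_utility[i] for i in items)
    return total


def high_utility_itemsets(transactions, unit_utility, min_utility):
    """Brute-force enumeration of all itemsets whose utility is >= min_utility."""
    all_items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, k):
            u = itemset_utility(candidate, transactions, unit_utility)
            if u >= min_utility:
                result[candidate] = u
    return result


found = high_utility_itemsets(transactions, unit_utility, min_utility=10)
for itemset, u in found.items():
    print(itemset, u)
```

In this toy data the itemset `("A",)` has utility 4 and fails a threshold of 10, while its superset `("A", "B")` has utility 17 and passes. Under this assumed definition, utility is therefore not downward closed the way support is, which illustrates why frequency-style pruning (discarding all supersets of a failing itemset) cannot be reused directly.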
This article is indexed in ScienceDirect and other databases.