Beyond Market Baskets: Generalizing Association Rules to Dependence Rules
Authors: Craig Silverstein, Sergey Brin, Rajeev Motwani
Affiliation: (1) Department of Computer Science, Stanford University, Stanford, CA 94305
Abstract: One of the more well-studied problems in data mining is the search for association rules in market basket data. Association rules are intended to identify patterns of the type: "A customer purchasing item A often also purchases item B." Motivated partly by the goal of generalizing beyond market basket data and partly by the goal of ironing out some problems in the definition of association rules, we develop the notion of dependence rules that identify statistical dependence in both the presence and absence of items in itemsets. We propose measuring significance of dependence via the chi-squared test for independence from classical statistics. This leads to a measure that is upward-closed in the itemset lattice, enabling us to reduce the mining problem to the search for a border between dependent and independent itemsets in the lattice. We develop pruning strategies based on the closure property and thereby devise an efficient algorithm for discovering dependence rules. We demonstrate our algorithm's effectiveness by testing it on census data, text data (wherein we seek term dependence), and synthetic data.
Keywords: data mining, market basket, association rules, dependence rules, closure properties, text mining
This article is indexed in SpringerLink and other databases.
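To illustrate the chi-squared test for independence that the abstract proposes as the significance measure, here is a minimal sketch in Python. It is not the authors' implementation. The function name `chi_squared`, the representation of baskets as Python sets, and the toy data are all assumptions made for illustration. The contingency table has one cell per presence/absence pattern of the items, so the absence of an item contributes as much as its presence.

```python
from itertools import product


def chi_squared(baskets, items):
    """Chi-squared statistic for independence of `items` over `baskets`.

    Hypothetical helper, not taken from the paper. `baskets` is a list of
    sets and `items` is a list of item names.
    """
    n = len(baskets)
    # Marginal probability that each item is present in a basket.
    p = {i: sum(i in b for b in baskets) / n for i in items}
    stat = 0.0
    # One contingency-table cell per presence/absence pattern.
    for pattern in product([True, False], repeat=len(items)):
        # Observed count: baskets that match this pattern exactly.
        observed = sum(
            all((i in b) == present for i, present in zip(items, pattern))
            for b in baskets
        )
        # Expected count if the items were independent.
        expected = n
        for i, present in zip(items, pattern):
            expected *= p[i] if present else 1 - p[i]
        if expected > 0:  # skip cells that cannot occur under independence
            stat += (observed - expected) ** 2 / expected
    return stat


# Toy data (assumed): items "a" and "b" always appear together or not at all.
dependent = [{"a", "b"}] * 5 + [set()] * 5
# Toy data (assumed): every presence/absence pattern occurs equally often.
independent = [{"a", "b"}, {"a"}, {"b"}, set()]

print(chi_squared(dependent, ["a", "b"]))
print(chi_squared(independent, ["a", "b"]))
```

For a pair of items the table is 2x2 and has one degree of freedom, so the conventional 95% cutoff is about 3.84. The first toy data set scores well above that cutoff, while the second scores zero, matching the intuition that perfectly co-occurring items are dependent and evenly spread ones are not.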