Detecting Group Differences: Mining Contrast Sets
Authors: Stephen D Bay, Michael J Pazzani
Affiliation: (1) Department of Information and Computer Science, University of California, Irvine, CA 92697, USA; (2) Department of Information and Computer Science, University of California, Irvine, CA 92697, USA
Abstract: A fundamental task in data analysis is understanding the differences between several contrasting groups. These groups can represent different classes of objects, such as male or female students, or the same group over time, e.g. freshman students in 1993 through 1998. We present the problem of mining contrast sets: conjunctions of attributes and values that differ meaningfully in their distribution across groups. We provide a search algorithm for mining contrast sets with pruning rules that drastically reduce the computational complexity. Once the contrast sets are found, we post-process the results to present a subset that is surprising to the user given what we have already shown. We explicitly control the probability of Type I error (false positives) and guarantee a maximum error rate for the entire analysis by using Bonferroni corrections.
Keywords: data mining, contrast sets, change detection, association rules
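
To make the abstract's central idea concrete, the following is a minimal sketch of checking one candidate contrast set between two groups. It is not the authors' search algorithm: the function names (`support`, `chi2_2x2`, `evaluate_contrast_set`), the `group` key, the toy data, the `min_diff` threshold, and the use of a plain Bonferroni division `alpha / n_tests` are all assumptions made for illustration, since the abstract does not give these details.

```python
import math

def support(rows, cset, group):
    """Count rows in `group` matching every attribute=value pair in `cset`.

    Returns (matching rows, total rows in the group).
    """
    members = [r for r in rows if r["group"] == group]
    hits = sum(all(r.get(a) == v for a, v in cset.items()) for r in members)
    return hits, len(members)

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def evaluate_contrast_set(rows, cset, g1, g2, alpha=0.05, n_tests=1, min_diff=0.1):
    """Test whether the support of `cset` differs between two non-empty groups.

    Assumptions (not from the paper): plain Bonferroni threshold alpha / n_tests,
    and a hypothetical minimum support difference `min_diff`.
    """
    h1, n1 = support(rows, cset, g1)
    h2, n2 = support(rows, cset, g2)
    stat = chi2_2x2(h1, n1 - h1, h2, n2 - h2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2))
    diff = abs(h1 / n1 - h2 / n2)
    return {
        "stat": stat,
        "p": p,
        "diff": diff,
        "significant": p < alpha / n_tests and diff >= min_diff,
    }

# Hypothetical toy data: two freshman cohorts with two attributes each.
rows = (
    [{"group": "1993", "major": "CS", "housing": "on"}] * 60
    + [{"group": "1993", "major": "CS", "housing": "off"}] * 40
    + [{"group": "1998", "major": "CS", "housing": "on"}] * 30
    + [{"group": "1998", "major": "CS", "housing": "off"}] * 70
)

# Candidate contrast set: a conjunction of attribute-value pairs.
result = evaluate_contrast_set(
    rows, {"major": "CS", "housing": "on"}, "1993", "1998", n_tests=10
)
# A set whose support is identical in both groups, for comparison.
null_result = evaluate_contrast_set(rows, {"major": "CS"}, "1993", "1998", n_tests=10)
```

Dividing `alpha` by the number of candidate sets tested bounds the chance of any false positive across the whole analysis, at the cost of requiring stronger evidence for each individual set. With more than two groups the 2x2 statistic would have to be replaced by a general contingency-table test.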
This article is indexed in SpringerLink and other databases.