
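Statistical Measures of the Interestingness of Association Patterns in Transaction Databases
Cite this article: XU Yong, ZHU Qi-xiang. Statistical Measures of the Interestingness of Association Patterns in Transaction Databases [J]. Modern Computer, 2005(11): 21-24.
Authors: XU Yong, ZHU Qi-xiang
Affiliation: School of Information Engineering, Anhui University of Finance and Economics, Bengbu 233041
Funding: Natural Science Foundation of Anhui Provincial Universities
Abstract: Association pattern mining is an important branch of data mining research that aims to discover associations or correlations among itemsets. However, traditional mining methods based on the support-confidence framework have two shortcomings: first, they generate too many patterns (both frequent itemsets and rules); second, some of the mined rules are uninteresting or useless to the user, or even wrong. It is therefore necessary to prune useless patterns effectively during mining. Introducing chi-squared analysis into the correlation measure of patterns, and using the chi-squared test to measure the correlation between itemsets and between the antecedent and consequent of a rule, is an effective pruning method. Analysis of the results shows that adding the chi-squared test on top of the support measure effectively prunes uncorrelated patterns and thereby reduces the number of frequent itemsets and rules.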

Keywords: data mining; association patterns; chi-squared test
Received: 2005-06-27
Revised: 2005-06-27

Statistical Measure on Interesting of Association Patterns in Retail Database
XU Yong, ZHU Qi-xiang. Statistical Measure on Interesting of Association Patterns in Retail Database [J]. Modern Computer, 2005(11): 21-24.
Authors: XU Yong, ZHU Qi-xiang
Abstract: Association pattern mining is one of the important tasks in data mining research; its main purpose is to find correlations between items. However, the common approach based on the support-confidence framework has some shortcomings. First, it generates a great number of redundant association rules; second, some of the generated patterns are unwanted or even misleading. It is therefore necessary to prune such uninteresting patterns. The chi-squared test is introduced to prune irrelevant items by calculating their chi-squared values. The experiment shows that the chi-squared test is effective and that the search space of the algorithm is reduced remarkably.
Keywords: Data Mining; Association Patterns; Chi-Squared Test
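
The abstract does not give the authors' formulas, so the following is only a minimal sketch of the general idea, assuming the standard Pearson chi-squared statistic on the 2x2 contingency table of a rule's antecedent A and consequent B. The function names (`chi_squared_2x2`, `keep_rule`), the 0.05 significance level, and the toy transaction counts are illustrative assumptions, not taken from the paper.

```python
from typing import FrozenSet, List

# Chi-squared critical value for 1 degree of freedom at alpha = 0.05
# (the significance level is an assumption; the abstract does not state one).
CHI2_CRITICAL_95 = 3.841


def chi_squared_2x2(n_ab: int, n_a: int, n_b: int, n: int) -> float:
    """Pearson chi-squared statistic for presence/absence of A versus B.

    n_ab: transactions containing both A and B
    n_a:  transactions containing A
    n_b:  transactions containing B
    n:    total number of transactions
    """
    observed = [
        [n_ab, n_a - n_ab],                  # A present: B present, B absent
        [n_b - n_ab, n - n_a - n_b + n_ab],  # A absent:  B present, B absent
    ]
    row_totals = [n_a, n - n_a]
    col_totals = [n_b, n - n_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                # Degenerate table: A or B occurs in all or in no transactions,
                # so no correlation can be measured.
                return 0.0
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def keep_rule(transactions: List[FrozenSet[str]],
              antecedent: FrozenSet[str],
              consequent: FrozenSet[str],
              critical: float = CHI2_CRITICAL_95) -> bool:
    """Return True if antecedent and consequent are significantly correlated."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_b = sum(1 for t in transactions if consequent <= t)
    n_ab = sum(1 for t in transactions if (antecedent | consequent) <= t)
    return chi_squared_2x2(n_ab, n_a, n_b, n) > critical


if __name__ == "__main__":
    # Toy data: 100 transactions, tea in 25, coffee in 90, both in 20.
    transactions = (
        [frozenset({"tea", "coffee"})] * 20
        + [frozenset({"tea"})] * 5
        + [frozenset({"coffee"})] * 70
        + [frozenset({"milk"})] * 5
    )
    print(round(chi_squared_2x2(20, 25, 90, 100), 4))  # 3.7037
    print(keep_rule(transactions, frozenset({"tea"}), frozenset({"coffee"})))  # False
```

In this toy data the rule tea -> coffee has support 0.20 and confidence 0.80, so a support-confidence filter would likely keep it, yet its chi-squared value (about 3.70) falls below the 3.841 threshold and the sketch prunes it as not significantly correlated.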
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.