
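An Association Rule Mining Method Based on a Double-Threshold Apriori Algorithm and Infrequent Itemsets
Cite this article:RUAN Meng-li, WU Lei. An Association Rule Mining Method Based on a Double-Threshold Apriori Algorithm and Infrequent Itemsets[J]. Application Research of Computers, 2018, 35(12).
Authors:RUAN Meng-li  WU Lei
Affiliations:School of Information Engineering, Shandong Management University; School of Information Science and Engineering, Shandong Normal University
Funding:National Natural Science Foundation of China (No. 61602287); Shandong Province Higher Educational Science and Technology Program (No. J16LN70)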
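Abstract:To address the problem of mining positive and negative association rules from text datasets, a mining method based on a double-threshold Apriori algorithm and infrequent itemsets is proposed. First, the items (itemsets) in the corpus are weighted by inverse document frequency (IDF), and the top N% of itemsets are selected. Then, frequent and infrequent itemsets are extracted with the proposed double-support-threshold Apriori algorithm, which reduces the number of infrequent itemsets. Finally, positive and negative association rules are mined separately from the frequent itemsets and the infrequent itemsets by checking confidence and lift thresholds. The novel aspect is the use of infrequent itemsets to mine positive and negative association rules. Experimental results on a medical text dataset show that the proposed method mines positive and negative association rules effectively and greatly reduces the number of itemsets and rules.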

Keywords:Positive and negative association rule mining  Double support threshold  Apriori algorithm  Infrequent itemsets  IDF weighting
Received:2017/11/28
Revised:2018/11/5

An Association Rule Mining Method Based on Double Threshold Apriori Algorithm and Infrequent Itemsets
RUAN Meng-li and WU Lei. An Association Rule Mining Method Based on Double Threshold Apriori Algorithm and Infrequent Itemsets[J]. Application Research of Computers, 2018, 35(12).
Authors:RUAN Meng-li and WU Lei
Abstract:To address the problem of mining positive and negative association rules from text datasets, a mining method based on a double-threshold Apriori algorithm and infrequent itemsets is proposed. First, the items (itemsets) in the corpus are weighted by inverse document frequency (IDF), and the top N% of itemsets are selected. Then, frequent and infrequent itemsets are extracted with the proposed double-support-threshold Apriori algorithm, which reduces the number of infrequent itemsets. Finally, positive and negative association rules are mined separately from the frequent itemsets and the infrequent itemsets by checking confidence and lift thresholds. The novel aspect is the use of infrequent itemsets to mine positive and negative association rules. Experimental results on a medical text dataset show that the proposed method mines positive and negative association rules effectively and greatly reduces the number of itemsets and rules.
Keywords:Positive and negative association rule mining  Double support threshold  Apriori algorithm  Infrequent itemsets  IDF weighting
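
The abstract gives only an outline of the method, so the following is a minimal Python sketch of the general idea rather than the authors' implementation. It assumes that the two support thresholds split itemsets into a frequent band (support at or above the upper threshold) and a retained infrequent band (support between the two thresholds), and that a rule is labeled positive when lift exceeds 1 and negative when lift is below 1, which is a common convention and not something the abstract states. The names `split_itemsets`, `classify_rule`, `min_sup_high`, `min_sup_low`, and `min_conf` are hypothetical, and the itemsets are enumerated by brute force instead of Apriori's level-wise candidate generation. IDF weighting is omitted.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def split_itemsets(transactions, size, min_sup_high, min_sup_low):
    """Partition all itemsets of the given size by two support thresholds.

    Assumed semantics (not taken from the paper):
      support >= min_sup_high                -> frequent
      min_sup_low <= support < min_sup_high  -> infrequent, but retained
      support < min_sup_low                  -> discarded
    """
    items = sorted(set().union(*transactions))
    frequent, infrequent = {}, {}
    for combo in combinations(items, size):
        candidate = frozenset(combo)
        s = support(candidate, transactions)
        if s >= min_sup_high:
            frequent[candidate] = s
        elif s >= min_sup_low:
            infrequent[candidate] = s
    return frequent, infrequent


def classify_rule(a, b, transactions, min_conf):
    """Label the rule between antecedent a and consequent b.

    Returns "positive" for A -> B, "negative" for A -> not B,
    or None when neither passes the lift and confidence checks.
    """
    sup_a = support(a, transactions)
    sup_b = support(b, transactions)
    if sup_a == 0 or sup_b == 0:
        return None
    sup_ab = support(a | b, transactions)
    lift = sup_ab / (sup_a * sup_b)
    conf_pos = sup_ab / sup_a      # confidence of A -> B
    conf_neg = 1 - conf_pos        # confidence of A -> not B
    if lift > 1 and conf_pos >= min_conf:
        return "positive"
    if lift < 1 and conf_neg >= min_conf:
        return "negative"
    return None
```

The lower threshold is what keeps the infrequent set small in this sketch: itemsets that fall below it are never considered for negative rules, which matches the abstract's stated goal of reducing the number of infrequent itemsets.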