Similar literature
 19 similar documents found (search time: 781 ms)
1.
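Associative text classification based on rule confidence adjustment   Total citations: 1, self-citations: 0, citations by others: 1
ARC-BC, an association-rule-based text classification method, is reported to be the best-performing associative classification algorithm known to date. This paper proposes RCA (Rules Confidence Adjustment), an algorithm that uses the results of closed testing of the ARC-BC classifier to adjust the classifier's rule confidences: a rule that takes part in more correct classifications than incorrect ones (that is, a rule with higher "prestige") should receive a higher confidence, and a rule in the opposite situation should receive a lower one. Experimental results show that the associative text classifier adjusted by RCA achieves significantly better classification performance.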
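The adjustment idea in the abstract above can be illustrated with a small sketch. The abstract does not give the RCA update formula, so everything concrete here is an assumption for illustration only: the `Rule` class, the way a rule "takes part" in a classification (it fires on the document), and the multiplicative `step`.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    terms: frozenset      # words that must all appear in a document
    label: str            # class predicted by the rule
    confidence: float     # confidence obtained from rule mining
    correct: int = 0      # closed-test firings with the right label
    wrong: int = 0        # closed-test firings with the wrong label


def tally_closed_test(rules, training_docs):
    """Count, per rule, correct and wrong firings on the training set.

    Assumption: a rule participates whenever all its terms occur in the
    document; the paper may define participation differently.
    """
    for words, true_label in training_docs:
        for rule in rules:
            if rule.terms <= words:
                if rule.label == true_label:
                    rule.correct += 1
                else:
                    rule.wrong += 1


def adjust_confidences(rules, step=0.05):
    """Raise confidence of rules that are right more often than wrong,
    lower it in the opposite case. The multiplicative step is hypothetical."""
    for rule in rules:
        if rule.correct > rule.wrong:
            rule.confidence = min(1.0, rule.confidence * (1 + step))
        elif rule.wrong > rule.correct:
            rule.confidence = max(0.0, rule.confidence * (1 - step))


rules = [
    Rule(frozenset({"goal"}), "sports", 0.8),
    Rule(frozenset({"goal"}), "finance", 0.8),
]
docs = [({"goal", "match"}, "sports"), ({"goal"}, "sports")]
tally_closed_test(rules, docs)
adjust_confidences(rules)
```

In this toy run the first rule fires correctly twice and gains confidence, while the second fires wrongly twice and loses confidence.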

2.
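Huo Weigang (霍纬纲), Gao Xiaoxia (高小霞). Control and Decision (《控制与决策》), 2012, 27(12): 1833-1838
Proposes a fuzzy associative classification method for multi-class imbalanced distributions. The method uses genetic optimization with three objectives: minimizing the weighted classification error rate of the training samples during the AdaBoost.M1W ensemble learning iterations, the number of fuzzy associative classification rules in each sub-classifier, and the number of fuzzy items in those rules. This integrates AdaBoost.M1W well with fuzzy associative classification modelling. Experimental comparisons on five multi-class imbalanced UCI benchmark data sets against existing data preprocessing methods for imbalanced classification show that the proposed method significantly improves the classification performance of fuzzy associative classification models in the multi-class imbalanced setting.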

3.
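Zhang Yong (张勇). Fujian Computer (《福建电脑》), 2007, (8): 124-125
By comparing existing multi-label classification with association-rule-based classification, this paper proposes a new classification method, multi-class multi-label associative classification. Compared with other classification methods, it is highly competitive and scalable.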

4.
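Studies the application of existing associative classification algorithms to text classification and finds that, for structured text data, these algorithms ignore the semantic information of the text, which leaves classification accuracy unsatisfactory. To address this, an associative text classification method based on rule reconstruction is proposed. Using a word co-occurrence model, the method starts from the already mined classification rules, combines word pairs with a high degree of co-occurrence, and reconstructs the rules, forming structured classification rules that carry textual semantic information; these are then used to classify new text. Experimental results show that the method outperforms other associative text classification methods (ARC) in classification accuracy.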

5.
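Ranking-based associative classification algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Proposes a ranking-based associative classification algorithm. Drawing on the preference for high-precision rules found in the best-rule selection approach of rule-based classification, and on the idea of considering as many rules as possible, it remedies the one-sidedness of CBA (Classification Based on Associations), which builds its classifier from only the few rules that cover the training set. The algorithm first uses an association rule mining algorithm to generate association rules whose consequent is a class label, then ranks the rules by length, confidence, support, lift and similar measures, deleting during ranking any rules that have no effect on the classification result. The ranked rules plus a default class form the final classifier. Experiments on 20 public UCI data sets show that the proposed algorithm achieves higher average classification accuracy than CBA.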
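A minimal sketch of a ranked rule list with a default class, as described in the abstract above, might look like the following. The abstract names the ranking criteria (length, confidence, support, lift) but not their priority or direction, so the sort key used here (higher confidence, then higher support, then higher lift, then fewer items) is an assumption, as are the `Rule` type and the first-match classification policy. The pruning of rules that do not affect the result is not shown.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Set


class Rule(NamedTuple):
    items: FrozenSet[str]   # antecedent: items that must all be present
    label: str              # consequent: predicted class label
    confidence: float
    support: float
    lift: float


def rank_rules(rules: Iterable[Rule]) -> List[Rule]:
    """Sort rules best-first. The priority order is hypothetical."""
    return sorted(
        rules,
        key=lambda r: (-r.confidence, -r.support, -r.lift, len(r.items)),
    )


def classify(ranked: List[Rule], default_label: str, instance: Set[str]) -> str:
    """Return the label of the first ranked rule whose antecedent is
    contained in the instance, or the default class if none matches."""
    for rule in ranked:
        if rule.items <= instance:
            return rule.label
    return default_label


ranked = rank_rules([
    Rule(frozenset({"a"}), "pos", 0.90, 0.30, 1.5),
    Rule(frozenset({"a", "b"}), "neg", 0.95, 0.10, 2.0),
])
```

With these two toy rules, an instance containing both `a` and `b` is labelled by the higher-confidence rule, an instance containing only `a` falls through to the second rule, and an instance matching neither receives the default class.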

6.
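Introduces the basic concepts of association rules, summarizes the categories of association rules and the various mining algorithms, describes some typical algorithms, and finally looks ahead to future research directions in association rule mining.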

7.
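Introduces the basic concepts of association rules, summarizes the categories of association rules and the various mining algorithms, describes some typical algorithms, and finally looks ahead to future research directions in association rule mining.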

8.
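Fuzzy association analysis is a new research direction. Based on the concept of association analysis and starting from algorithms for multi-objective linear programming problems, this work studies an Apriori-based fuzzy associative classification method and proposes an improved storage structure and process design for the CFACA algorithm. A fuzzy association analysis of the growth potential of mobile communication customers yielded fairly satisfactory association rules.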

9.
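Chang Lulu (常璐璐), Liu Chunxia (刘春霞). Fujian Computer (《福建电脑》), 2007, (9): 37-37, 19
Discusses the state of association rule research, gives the concept and categories of association rules, analyzes and evaluates the main mining and maintenance methods for association rules, and finally outlines development trends in association rule research.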

10.
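Studies and analyzes existing association-rule-based classification algorithms, summarizes the shortcomings of general association rule classification, and proposes a new method for constructing classifiers based on association rule mining. The method addresses two problems of traditional algorithms: they generate too many rules, and the resulting classification model is hard to understand.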

11.
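Associative classification usually produces a large number of classification rules, which often leads to rule conflicts when classifying new instances. To address this rule conflict problem, a two-stage learning framework based on improved associative classification is proposed. The associative classification algorithm is improved by generating classification rules from itemsets that are frequent and mutually associated, which effectively reduces the number of rules. The first-level rules produced by the improved algorithm are applied to separate out, in a single pass, all instances in the training set on which rules conflict. The improved algorithm is then applied to the conflicting instances in a second round of learning to obtain second-level rules. When classifying a new instance, the first-level rules are used first; if a rule conflict occurs, the second-level rules are used to classify the instance. Experimental results show that the two-stage learning method based on improved associative classification lowers the rule conflict ratio and significantly improves classification accuracy.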
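The classification step of the two-level scheme described above can be sketched as follows. The abstract does not define a rule conflict precisely; this sketch assumes a conflict means that the matching first-level rules predict more than one class. The rule representation (a pair of item set and label), the function names, and the fallback to a `default` label are all hypothetical.

```python
def matching_labels(rules, instance):
    """Labels predicted by every rule whose antecedent the instance contains."""
    return {label for items, label in rules if items <= instance}


def classify_two_level(level1, level2, instance, default=None):
    """Use first-level rules; on conflict, defer to second-level rules.

    Assumption: a conflict is more than one distinct predicted label.
    """
    labels = matching_labels(level1, instance)
    if len(labels) == 1:
        return next(iter(labels))
    if len(labels) > 1:
        # Conflict among first-level rules: consult the rules learned
        # from the conflicting training instances.
        second = matching_labels(level2, instance)
        if len(second) == 1:
            return next(iter(second))
    return default


level1 = [(frozenset({"a"}), "x"), (frozenset({"b"}), "y")]
level2 = [(frozenset({"a", "b"}), "y")]
```

Here an instance containing only `a` is decided at the first level, while an instance containing both `a` and `b` triggers a conflict and is decided by the second-level rule.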

12.
Associative classifiers are classification systems based on associative classification rules. Although associative classification is more accurate than a traditional classification approach, it cannot handle numerical data and its relationships. Therefore, an ongoing research problem is how to build associative classifiers from numerical data. In this work, we focus on stock trading data with many numerical technical indicators, and the classification problem is finding sell and buy signals from the technical indicators. This study proposes a GA-based algorithm used to build an associative classifier that can discover trading rules from these numerical indicators. The experimental results show that the proposed approach is an effective classification technique with high prediction accuracy and is highly competitive when compared with the data distribution method.

13.
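Associative classification of Chinese text based on information gain   Total citations: 1, self-citations: 0, citations by others: 1
Associative classification is a classification technique that mines association rules from the training set and uses these rules to predict the class attribute of new data. Recent studies show that associative classification achieves higher accuracy than traditional classification methods such as C4.5. Existing associative classification methods based on the support-confidence framework merely select frequent words to build classification rules and ignore how useful those words are for classification. This paper proposes a new algorithm, ACIG, which combines information gain with FoilGain to select the words of rules in Chinese text, so as to improve the classification effectiveness of the words. Experimental results show that ACIG achieves higher accuracy than other associative classification algorithms (CPAR).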
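As a reminder of what the information gain of a word measures, here is a short sketch of the standard definition: the class entropy minus the class entropy conditioned on whether the word is present. It is not the ACIG algorithm itself; how ACIG combines this score with FoilGain is not described in the abstract and is not attempted here. The document representation (a set of words paired with a label) and the function names are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(docs, term):
    """Information gain of splitting docs by presence of `term`.

    docs is a list of (set_of_words, label) pairs.
    """
    labels = [label for _, label in docs]
    with_term = [label for words, label in docs if term in words]
    without_term = [label for words, label in docs if term not in words]
    conditional = sum(
        len(part) / len(docs) * entropy(part)
        for part in (with_term, without_term)
        if part  # skip empty partitions
    )
    return entropy(labels) - conditional


docs = [
    ({"goal", "match"}, "sports"),
    ({"goal"}, "sports"),
    ({"stock"}, "finance"),
    ({"stock", "match"}, "finance"),
]
```

In this toy corpus the word `goal` separates the two classes perfectly, whereas `match` appears equally often in both and carries no class information, which is the kind of frequent but uninformative word the abstract says support-confidence methods fail to filter out.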

14.
Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has been mainly focused on increasing classifier accuracies, not much effort has been devoted towards building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule-base by selecting rules which contribute towards increasing training accuracy, thus balancing classification accuracy with the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lend better interpretability, when compared with other rule-based systems.

15.
Applied Soft Computing, 2007, 7(3): 1102-1111
Classification and association rule discovery are important data mining tasks. Using association rule discovery to construct classification systems, also known as associative classification, is a promising approach. In this paper, a new associative classification technique, the Ranked Multilabel Rule (RMR) algorithm, is introduced, which generates rules with multiple labels. Rules derived by current associative classification algorithms overlap in their training objects, resulting in many redundant and useless rules. However, the proposed algorithm resolves the overlapping between rules in the classifier by generating rules that do not share training objects during the training phase, resulting in a more accurate classifier. Results obtained from experimenting on 20 binary, multi-class and multi-label data sets show that the proposed technique is able to produce classifiers that contain rules associated with multiple classes. Furthermore, the results reveal that removing overlapping of training objects between the derived rules produces highly competitive classifiers if compared with those extracted by decision trees and other associative classification techniques, with respect to error rate.

16.
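Mining classification association rules from Web logs   Total citations: 1, self-citations: 0, citations by others: 1
User classification is an important task in Web access pattern mining research. A method is proposed that applies associative classification techniques to classify Web users: Web log files are first preprocessed to obtain a training transaction data set; classification association rules are then mined from this transaction set, and the mined rule set is used to build a classifier, so that users are classified according to their access history.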

17.
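Remote sensing image classification is one of the research hotspots in the remote sensing field. A fuzzy associative remote sensing image classification method based on adaptive interval partitioning is proposed (fuzzy associative remote sensing classification, FARSC). Tailored to the characteristics of remote sensing image classification, the algorithm uses fuzzy C-means clustering to adaptively build fuzzy intervals for continuous attributes, applies a new pruning strategy to filter itemsets and so avoid generating useless rules, and uses a new rule importance measure to fuse multiple fuzzy classification rules, thereby effectively improving classification efficiency and accuracy. Experiments on UCI data and remote sensing images show that the algorithm achieves high classification accuracy and is insensitive to changes in the number of samples; FARSC is a practical and effective method for remote sensing image classification.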

18.
Having received considerable interest in recent years, associative classification has focused on developing a class classifier, with lesser attention paid to the probability classifier used in direct marketing. While contributing to this integrated framework, this work attempts to increase the prediction accuracy of associative classification on class imbalance by adapting the scoring based on associations (SBA) algorithm. The SBA algorithm is modified by coupling it with the pruning strategy of association rules in the probabilistic classification based on associations (PCBA) algorithm, which is adjusted from the CBA for use in the structure of the probability classifier. PCBA is adjusted from CBA by increasing the confidence through under-sampling, setting different minimum supports (minsups) and minimum confidences (minconfs) for rules of different classes based on each distribution, and removing the pruning rules of the lowest error rate. Experimental results based on benchmark datasets and real-life application datasets indicate that the proposed method performs better than C5.0 and the original SBA do, and the number of rules required for scoring is significantly reduced.

19.
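When traditional associative classification methods handle quantitative data, the "discretize first, then learn" procedure means that a new test example may fail to fall into a suitable discretization interval, which gives rise to the problem of discretization blindness. Lazy-based quantitative associative classification, a new associative classification method, first uses the idea of K-nearest-neighbour classification to find the K nearest neighbours of the test example as a new training data set, then discretizes the data set containing the test example and its K neighbours, and finally mines association rules on the discretized data set formed by the K nearest neighbours and builds a classifier to perform classification. Comparative experiments against the traditional CBA, CMAR and CPAR algorithms on seven commonly used UCI quantitative data sets show that the lazy-based quantitative associative classification method improves average classification accuracy by 0.66% to 1.65%, demonstrating the feasibility of the method.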
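The first two steps of the lazy procedure described above (select the K nearest neighbours, then discretize them together with the test example) can be sketched as below. The abstract specifies neither the distance measure nor the discretization method, so Euclidean distance and equal-width binning are assumptions, as are all function names. Rule mining on the discretized neighbours is not shown.

```python
import math


def k_nearest(train, query, k):
    """Return the k training rows (features, label) closest to query.
    Euclidean distance is an assumption."""
    return sorted(train, key=lambda row: math.dist(row[0], query))[:k]


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] (equal-width bins)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # avoid zero width for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def lazy_discretize(train, query, k, n_bins=3):
    """Discretize the query together with its k nearest neighbours.

    Because the bin boundaries are computed from data that includes the
    query itself, the query always falls into a valid interval.
    """
    neighbours = k_nearest(train, query, k)
    rows = [features for features, _ in neighbours] + [tuple(query)]
    columns = list(zip(*rows))
    binned_rows = list(zip(*(equal_width_bins(col, n_bins) for col in columns)))
    binned_neighbours = [
        (bins, label) for bins, (_, label) in zip(binned_rows[:-1], neighbours)
    ]
    return binned_neighbours, binned_rows[-1]


train = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "b"), ((10.0,), "b")]
binned_neighbours, binned_query = lazy_discretize(train, (1.5,), k=3)
```

In this toy run the distant point at 10.0 is excluded from the neighbourhood, so the bins are fitted to the local range around the query rather than to the whole training set.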
