Similar Documents
16 similar documents found.
1.
To address the problem that a classifier minimizing the error rate does not necessarily minimize the misclassification cost, a cost-sensitive criterion is proposed: a dual criterion of minimizing the misclassification cost and minimizing the error rate. Bayesian network structure learning based on this cost-sensitive criterion is studied, requiring that the structure search not only achieve the minimum misclassification cost but also attain an error rate better than that of the current best model. Cost-sensitive Bayesian networks are learned on UCI data sets and compared with the corresponding generative and discriminative Bayesian networks; the results demonstrate the effectiveness of the cost-sensitive Bayesian networks.
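The abstract's dual criterion can be phrased as an acceptance test inside a greedy structure search. Below is a minimal sketch, not the paper's search procedure: `evaluate(structure)` is a hypothetical helper returning the expected misclassification cost and error rate, e.g. from cross-validation.

```python
def search_structure(candidates, evaluate):
    """Greedy search keeping the structure that satisfies the dual criterion.

    evaluate(structure) is a hypothetical helper returning
    (expected_misclassification_cost, error_rate).
    """
    best = candidates[0]
    best_cost, best_err = evaluate(best)
    for cand in candidates[1:]:
        cost, err = evaluate(cand)
        # Dual criterion: strictly lower cost AND an error rate
        # no worse than the current best model.
        if cost < best_cost and err <= best_err:
            best, best_cost, best_err = cand, cost, err
    return best
```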

2.
In many real-world applications, misclassifying samples of different classes incurs losses of different magnitudes, which can be characterized by unequal costs. The goal of cost-sensitive learning is to minimize the total cost. A new cost-sensitive classification method, the cost-sensitive large margin distribution machine (CS-LDM), is proposed. Unlike traditional large-margin methods, which try to maximize the "minimum margin", CS-LDM optimizes the "margin distribution" while minimizing the total cost, and solves its objective with dual coordinate descent for efficient cost-sensitive learning. Experimental results show that CS-LDM significantly outperforms the cost-sensitive support vector machine (CS-SVM), reducing the average total cost by 24%.
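CS-LDM itself is not available in common libraries; for orientation, here is a minimal sketch of the CS-SVM-style baseline the abstract compares against, where unequal misclassification costs enter the SVM hinge loss as per-class weights. The 5:1 cost ratio, the data, and the scikit-learn setup are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Imbalanced toy data; class 1 is rare.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

# Misclassifying a class-1 sample is assumed 5x as costly as a class-0 sample,
# so class-1 training errors are weighted 5x in the hinge loss.
clf = LinearSVC(class_weight={0: 1.0, 1: 5.0}).fit(X, y)
```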

3.
万建武  杨明 《软件学报》2020,31(1):113-136
Classification is one of the important tasks in machine learning. Traditional classification algorithms pursue the lowest classification error rate, assuming that different types of misclassification incur equal losses. However, in application domains such as face-recognition access control, software defect prediction, and multi-label learning, the losses caused by different types of misclassification differ greatly. This requires the learning algorithm to pay special attention to samples that may incur high misclassification losses, so that the overall misclassification loss of the learned model is minimized. To address this problem, cost-sensitive learning has attracted great attention from researchers. Taking the theoretical foundations of cost-sensitive learning as its starting point, this survey systematically reviews the main models and methods of cost-sensitive learning as well as its representative application domains, and finally discusses and anticipates possible future research directions.

4.
Sensory quality evaluation of tobacco leaf based on cost-sensitive Bayesian networks
Bayesian networks have many advantages in discriminative classification and are applied here to predict and evaluate the sensory quality of tobacco leaf. Because the misclassification costs of some tobacco quality indicators differ, a cost-sensitive Bayesian network is proposed: its structure is learned with a generative criterion and its parameters are estimated in a cost-sensitive manner. The cost-sensitive Bayesian network is applied to predict and evaluate the sensory quality of a set of tobacco leaves, and the results demonstrate its effectiveness in sensory quality evaluation of tobacco.

5.
A comparative study of cost-sensitive classifiers
This paper briefly reviews the theory of cost-sensitive learning and existing cost-sensitive learning algorithms. Cost-sensitive algorithms are divided into two categories, direct cost-sensitive learning and cost-sensitive meta-learning, where the latter can convert a cost-insensitive classifier into a cost-sensitive one. A simple, general, and effective meta-learning algorithm called empirical threshold adjusting (ETA) is proposed, and the performance of ETA and various other cost-sensitive meta-learning algorithms is evaluated. ETA almost always achieves the lowest misclassification cost and is the least sensitive to the misclassification cost ratio. Several other useful conclusions about meta-learning are also obtained.
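The abstract does not spell out ETA's exact procedure; the following is a minimal sketch of empirical threshold adjustment for the binary case, assuming a cost-insensitive classifier that outputs class-1 probabilities and an assumed pair of false-positive/false-negative costs.

```python
import numpy as np

def empirical_threshold(probs, y_true, c_fp, c_fn):
    """Pick the decision threshold minimizing empirical cost on held-out data.

    probs: P(y=1|x) from any cost-insensitive classifier;
    c_fp / c_fn: assumed costs of false positives / false negatives.
    """
    best_t, best_cost = 0.5, np.inf
    for t in np.unique(probs):            # candidate thresholds
        pred = (probs >= t).astype(int)
        cost = c_fp * np.sum((pred == 1) & (y_true == 0)) \
             + c_fn * np.sum((pred == 0) & (y_true == 1))
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t
```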

6.
An ensemble learning algorithm for multi-label cost-sensitive classification
付忠良 《自动化学报》2014,40(6):1075-1085
Although multi-label classification problems can be converted into ordinary multi-class problems, multi-label cost-sensitive classification is difficult to convert into multi-class cost-sensitive classification. By analyzing the problems encountered when extending multi-class cost-sensitive learning algorithms to the multi-label setting, an ensemble learning algorithm for multi-label cost-sensitive classification is proposed. Its average misclassification cost is the sum of the costs of falsely detected labels and missed labels, and its flow resembles AdaBoost (adaptive boosting): it automatically learns multiple weak classifiers and combines them into a strong classifier whose average misclassification cost decreases gradually as weak classifiers are added. The differences between the proposed algorithm and multi-class cost-sensitive AdaBoost are analyzed in detail, including the basis for outputting labels and the meaning of misclassification cost. Unlike ordinary multi-class cost-sensitive classification, the misclassification costs in multi-label cost-sensitive classification must satisfy certain constraints, which are analyzed in detail and stated explicitly. Simplifying the algorithm yields a multi-label AdaBoost algorithm and a multi-class cost-sensitive AdaBoost algorithm. Both theoretical analysis and experimental results show that the proposed ensemble algorithm is effective and can minimize the average misclassification cost. In particular, for multi-class problems whose misclassification costs differ greatly across classes, it clearly outperforms existing multi-class cost-sensitive AdaBoost algorithms.

7.
Cost-sensitive probabilistic neural networks and their application to fault diagnosis
Most traditional classification algorithms take minimization of the misclassification rate as their objective, ignoring both the differences between error types and the imbalance of data sets; to address this, a cost-sensitive probabilistic neural network algorithm is proposed. The algorithm introduces a cost-sensitive mechanism into the probabilistic neural network, replaces the misclassification rate with the expected cost, takes minimization of the expected cost as its objective, and predicts the class of a new sample using the minimum-expected-cost Bayes decision rule. The algorithm's effectiveness is verified on industrial field data and on the German Credit data set. Experimental results show that it offers a high fault recognition rate, strong generalization ability, and short modeling time.
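The minimum-expected-cost Bayes decision rule the abstract relies on is standard and compact; a small sketch follows, with an illustrative 2-class cost matrix (in the paper's setting the posteriors would come from the probabilistic neural network, but any probabilistic model works).

```python
import numpy as np

def min_expected_cost_class(posteriors, cost):
    """Minimum-expected-cost Bayes decision rule.

    posteriors: shape (n_classes,), P(class i | x) from a probabilistic model.
    cost: shape (n_classes, n_classes), cost[i, j] = cost of predicting j
          when the true class is i (an assumed, illustrative matrix).
    """
    expected = posteriors @ cost          # expected cost of each prediction j
    return int(np.argmin(expected))

# Missing class 1 is costly, so a modest posterior still yields prediction 1.
p = np.array([0.7, 0.3])
C = np.array([[0.0, 1.0],
              [5.0, 0.0]])
print(min_expected_cost_class(p, C))      # -> 1, despite P(class 0|x) = 0.7
```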

8.
Standard classifier design is mostly based on minimizing the overall error rate, but in fields such as intrusion detection and medical diagnosis, misclassifications of different classes usually incur unequal losses. This paper builds models with support vector machines and, following the idea of ensemble algorithms, introduces an ensemble cost-sensitive support vector machine to compensate for the uncontrollable classification accuracy of the traditional cost-sensitive SVM. A more practical comparison scheme is introduced for model selection, so as to reduce the total misclassification cost. Taking the unequal misclassification costs of different classes into account, a suitable SVM model is built and successfully applied to personal credit classification.

9.
A new cost-sensitive classification method
Cost-sensitive learning takes into account the different costs incurred by different misclassifications during machine learning. This paper applies a recent result in Bayesian classification research to cost-sensitive learning and proposes a new algorithm called cost-sensitive hidden naive Bayes. Experiments show that the method is more effective than another typical cost-sensitive algorithm.

10.
A cost-sensitive AdaBoost algorithm for multi-class classification
付忠良 《自动化学报》2011,37(8):973-983
To address the cost-merging problem that arises when a multi-class cost-sensitive classification problem is converted into binary cost-sensitive problems, a cost-sensitive AdaBoost algorithm directly applicable to multi-class problems is studied and constructed. The algorithm has a flow and error estimate similar to real (continuous) AdaBoost. When all costs are equal, it reduces to a new multi-class real AdaBoost algorithm that guarantees the training error rate decreases as classifiers are added, without directly requiring the classifiers to be mutually independent; the independence condition can instead be ensured by the algorithm's rules, whereas the derivation of existing multi-class real AdaBoost must assume mutual independence. Experiments show that the algorithm truly biases predictions toward classes with lower misclassification costs; in particular, when the costs of misclassifying each class into the others are unbalanced but the average costs are equal, existing multi-class cost-sensitive learning algorithms fail, while the new method still achieves the minimum misclassification cost. The approach offers a new line of thinking for the study of ensemble learning and yields an easy-to-implement AdaBoost algorithm for multi-label classification that approximately minimizes the classification error rate.
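The paper's exact update rules are not given in the abstract; for orientation, here is a generic cost-weighted boosting round in the spirit of AdaC-style variants, where costly samples receive larger weight increases when misclassified. The function and its arguments are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def cost_weighted_round(w, y_true, y_pred, cost_of):
    """One cost-weighted boosting round (a generic AdaC-style sketch).

    w: current sample weights; y_pred: this round's weak-learner predictions;
    cost_of[i]: assumed cost of misclassifying sample i.
    """
    miss = (y_pred != y_true).astype(float)
    err = np.clip(np.sum(w * miss) / np.sum(w), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)   # weak-learner vote weight
    # Costlier samples get larger weight increases when misclassified.
    w = w * np.exp(alpha * cost_of * miss)
    return w / w.sum(), alpha
```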

11.
A cost-sensitive listwise ranking algorithm
Learning to rank is one of the research hotspots in information retrieval and machine learning. In information retrieval, the top of the predicted ranking list is the most important part, but a classic family of learning-to-rank algorithms, listwise ranking algorithms, cannot emphasize the top of the predicted list. To solve this problem, the idea of cost-sensitive learning is incorporated into listwise ranking, yielding a cost-sensitive listwise ranking framework: document weights are introduced into the listwise loss function, with the weights computed from the evaluation metric NDCG. On this basis, the loss function of the cost-sensitive listwise algorithm is further proved to be an upper bound of the NDCG loss. To verify the framework's effectiveness, a cost-sensitive ListMLE algorithm is proposed within it, together with theoretical studies of its order preservation and generalization; the theory confirms that the algorithm is order-preserving. Experiments on benchmark data sets show that cost-sensitive ListMLE outperforms traditional learning-to-rank algorithms at the top of the predicted ranking list.
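To make the framework's central idea concrete, here is a weighted ListMLE-style loss in which NDCG-style gains and position discounts serve as per-document weights. The exact weighting the paper proves to upper-bound the NDCG loss is not given in the abstract, so this is an assumption-laden illustration rather than the paper's loss.

```python
import numpy as np

def weighted_listmle_loss(scores, relevance):
    """Weighted ListMLE-style loss (illustrative NDCG-flavored weighting).

    scores: model scores per document; relevance: graded relevance labels.
    """
    order = np.argsort(-relevance)          # ground-truth ranking
    s = scores[order]
    # NDCG-style weights: high-gain documents at top positions matter most.
    w = (2.0 ** relevance[order] - 1) / np.log2(np.arange(len(s)) + 2)
    loss = 0.0
    for i in range(len(s)):
        # Plackett-Luce term: prob. the true doc tops the remaining set.
        loss -= w[i] * (s[i] - np.log(np.sum(np.exp(s[i:]))))
    return loss
```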

12.
Feature selection is an important preprocessing step in machine learning and data mining, and feature selection for class-imbalanced data is a hot research topic in machine learning and pattern recognition. Most traditional feature-selection algorithms pursue high accuracy and assume that the data carry no misclassification costs or equal costs, whereas in real applications different misclassifications often incur different costs. To obtain the feature subset with the minimum misclassification cost, this paper proposes a cost-sensitive feature selection algorithm based on sample-neighborhood preservation, whose core idea is to introduce sample neighborhoods into an existing cost-sensitive feature selection framework. Experimental results on 8 real data sets demonstrate the superiority of the algorithm.
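The neighborhood-preserving criterion itself is not described in the abstract; as a simpler stand-in, here is a cost-driven forward-selection wrapper that scores candidate feature subsets by their empirical misclassification cost with a kNN classifier. All names, the cost matrix, and the use of kNN are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def greedy_cost_selection(X_tr, y_tr, X_va, y_va, cost, k_max):
    """Cost-driven forward feature selection (a simple wrapper sketch,
    not the paper's neighborhood-preserving criterion).

    cost: ndarray, cost[i, j] = cost of predicting j when the truth is i.
    """
    selected, remaining = [], list(range(X_tr.shape[1]))

    def empirical_cost(feats):
        clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr[:, feats], y_tr)
        pred = clf.predict(X_va[:, feats])
        return cost[y_va, pred].sum()       # total validation cost

    while remaining and len(selected) < k_max:
        # Add whichever remaining feature lowers validation cost the most.
        best_f = min(remaining, key=lambda f: empirical_cost(selected + [f]))
        selected.append(best_f)
        remaining.remove(best_f)
    return selected
```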

13.
In cost-sensitive learning, misclassification costs can vary for different classes. This paper investigates an approach that reduces multi-class cost-sensitive learning to a standard classification task, based on the data space expansion technique developed by Abe et al., which coincides with Elkan's reduction for binary classification tasks. Using this proposed reduction, a cost-sensitive learning problem can be solved by considering a standard 0/1-loss classification problem on a new distribution determined by the cost matrix. We also propose a new weighting mechanism to solve the reduced standard classification problem, based on a theorem stating that the empirical loss on independently and identically distributed samples from the new distribution is essentially the same as the loss on the expanded weighted training set. Experimental results on several synthetic and benchmark datasets show that our weighting approach is more effective than existing representative approaches for cost-sensitive learning.
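For the binary case, where the reduction coincides with Elkan's, the idea is simply to weight each example by the cost incurred if it were misclassified and hand the weights to any standard 0/1-loss learner. A minimal sketch under that assumption (the paper's multi-class data space expansion and weighting mechanism are more involved):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=400, weights=[0.85, 0.15], random_state=0)
c_fp, c_fn = 1.0, 5.0                       # assumed binary cost matrix
# Each example's weight is the cost of misclassifying it:
# positives carry the false-negative cost, negatives the false-positive cost.
sample_weight = np.where(y == 1, c_fn, c_fp)
clf = LogisticRegression().fit(X, y, sample_weight=sample_weight)
```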

14.
The last decade has seen an increase in the attention paid to the development of cost-sensitive learning algorithms that aim to minimize misclassification costs while still maintaining accuracy. Most of this attention has been on cost-sensitive decision tree learning, whereas relatively little attention has been paid to assessing whether it is possible to develop better cost-sensitive classifiers based on Bayesian networks. Hence, this paper presents EBNO, an algorithm that uses genetic algorithms to learn cost-sensitive Bayesian networks, where genes represent the links between the nodes in Bayesian networks and the expected cost is used as the fitness function. An empirical comparison of the new algorithm has been carried out with respect to (a) an algorithm that induces cost-insensitive Bayesian networks to provide a baseline, (b) ICET, a well-known algorithm that uses genetic algorithms to induce cost-sensitive decision trees, (c) use of MetaCost to induce cost-sensitive Bayesian networks via bagging, (d) use of AdaBoost to induce cost-sensitive Bayesian networks, and (e) use of XGBoost, a gradient boosting algorithm, to induce cost-sensitive decision trees. An empirical evaluation on 28 data sets reveals that EBNO performs well in comparison with the algorithms that produce single interpretable models and performs just as well as algorithms that use bagging and boosting methods.

15.
We tackle structured output classification using conditional random fields (CRFs). Unlike the standard 0/1-loss case, we consider a cost-sensitive learning setting where we are given a non-0/1 misclassification cost matrix at the individual output level. Although cost-sensitive classification has many interesting practical applications that retain domain-specific scales in the output space (e.g., hierarchical or ordinal scales), most CRF learning algorithms are unable to deal effectively with cost-sensitive scenarios, as they merely assume a nominal scale (hence 0/1 loss) in the output space. In this paper, we incorporate the cost-sensitive loss into the large-margin learning framework. Through large-margin learning, the proposed algorithm inherits most benefits of SVM-like margin-based classifiers, such as provable generalization error bounds. Moreover, the soft-max approximation employed in our approach yields a convex optimization similar to standard CRF learning, with only slight modification of the potential functions. We also provide a theoretical cost-sensitive generalization error bound. We demonstrate the improved prediction performance of the proposed method over existing approaches on a diverse set of sequence/image structured prediction problems that often arise in the pattern recognition and computer vision domains.

16.
A novel framework is proposed for the design of cost-sensitive boosting algorithms. The framework is based on the identification of two necessary conditions for optimal cost-sensitive learning that 1) expected losses must be minimized by optimal cost-sensitive decision rules and 2) empirical loss minimization must emphasize the neighborhood of the target cost-sensitive boundary. It is shown that these conditions enable the derivation of cost-sensitive losses that can be minimized by gradient descent, in the functional space of convex combinations of weak learners, to produce novel boosting algorithms. The proposed framework is applied to the derivation of cost-sensitive extensions of AdaBoost, RealBoost, and LogitBoost. Experimental evidence, with a synthetic problem, standard data sets, and the computer vision problems of face and car detection, is presented in support of the cost-sensitive optimality of the new algorithms. Their performance is also compared to those of various previous cost-sensitive boosting proposals, as well as the popular combination of large-margin classifiers and probability calibration. Cost-sensitive boosting is shown to consistently outperform all other methods.
