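Cost-sensitive Ensemble Learning Algorithm for Multi-label Classification Problems
Cite this article: FU Zhong-Liang. Cost-sensitive Ensemble Learning Algorithm for Multi-label Classification Problems. Acta Automatica Sinica, 2014, 40(6): 1075-1085. doi: 10.3724/SP.J.1004.2014.01075
Author: FU Zhong-Liang
Affiliation: 1. Chengdu Computer Applications Institute, Chinese Academy of Sciences, Chengdu 610041
Funding: Supported by the Sichuan Province Science and Technology Support Program (2011GZ0171, 2012GZ0106)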
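Abstract:
Although a multi-label classification problem can be converted into an ordinary multi-class classification problem, a multi-label cost-sensitive classification problem is very difficult to convert into a multi-class cost-sensitive classification problem. Based on an analysis of the problems encountered when extending multi-class cost-sensitive learning algorithms to multi-label cost-sensitive learning, a cost-sensitive ensemble learning algorithm for multi-label classification is proposed. The algorithm's average misclassification cost is the sum of the cost of falsely detected labels and the cost of missed labels. Its procedure is similar to that of the adaptive boosting (AdaBoost) algorithm: it automatically learns multiple weak classifiers and combines them into a strong classifier, and the strong classifier's average misclassification cost decreases gradually as weak classifiers are added. The differences between the proposed algorithm and the multi-class cost-sensitive AdaBoost algorithm are analyzed in detail, including the basis on which labels are output and the meaning of the misclassification cost. Unlike the usual multi-class cost-sensitive classification problem, the misclassification costs of a multi-label cost-sensitive classification problem are subject to certain restrictions; these are analyzed in detail and the specific conditions are given. Simplifying the algorithm yields a multi-label AdaBoost algorithm and a multi-class cost-sensitive AdaBoost algorithm. Both theoretical analysis and experimental results show that the proposed algorithm is effective and can minimize the average misclassification cost. In particular, for multi-class problems in which the misclassification costs of different classes differ greatly, the algorithm performs markedly better than existing multi-class cost-sensitive AdaBoost algorithms.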

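Keywords: Multi-label classification   cost-sensitive learning   ensemble learning   adaptive boosting algorithm   multi-class classification
Received: 2013-07-30
Revised: 2013-09-29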

Cost-sensitive Ensemble Learning Algorithm for Multi-label Classification Problems
FU Zhong-Liang. Cost-sensitive Ensemble Learning Algorithm for Multi-label Classification Problems. ACTA AUTOMATICA SINICA, 2014, 40(6): 1075-1085. doi: 10.3724/SP.J.1004.2014.01075
Authors:FU Zhong-Liang
Affiliation:1. Chengdu Computer Applications Institute, Chinese Academy of Sciences, Chengdu 610041
Abstract:
Although a multi-label classification problem can be converted into a multi-class classification problem, a multi-label cost-sensitive classification problem is difficult to convert into a multi-class cost-sensitive classification problem. Based on an analysis of the problems encountered when a multi-class cost-sensitive learning algorithm is extended to the multi-label cost-sensitive setting, a cost-sensitive ensemble learning algorithm for multi-label classification problems is proposed. The average misclassification cost of the algorithm is the sum of the fall-out cost (the cost of falsely detected labels) and the omission cost (the cost of missed labels). The procedure of the new algorithm is similar to that of the adaptive boosting (AdaBoost) algorithm: it automatically learns multiple weak classifiers and combines them into a strong classifier, and the average misclassification cost of the strong classifier decreases gradually as weak classifiers are added. The distinction between the proposed algorithm and the cost-sensitive AdaBoost algorithm for multi-class classification problems is analyzed in detail, including the basis on which labels are output and the meaning of the misclassification cost. Unlike in general multi-class cost-sensitive classification problems, the misclassification costs in multi-label cost-sensitive classification problems are subject to certain restrictions, and the specific restrictions are given. A multi-label AdaBoost algorithm and a multi-class cost-sensitive AdaBoost algorithm can be obtained by simplifying the proposed algorithm. Theoretical analysis and experimental results show that the proposed algorithm is effective and can minimize the average misclassification cost. In particular, for multi-class problems in which the misclassification costs of different classes differ greatly, the proposed algorithm achieves better results than existing multi-class cost-sensitive AdaBoost algorithms.
Keywords:Multi-label classification  cost-sensitive learning  ensemble learning  adaptive boosting (AdaBoost) algorithm  multi-class classification
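
The cost definition in the abstract (average misclassification cost as the sum of the cost of falsely detected labels and the cost of missed labels) can be illustrated with a minimal Python sketch. It assumes one fixed cost per label for each error type, which is only one simple reading of the abstract; the function name, the set-based label representation, and the example cost values are hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable, List, Set


def average_misclassification_cost(
    true_labels: List[Set[Hashable]],
    predicted_labels: List[Set[Hashable]],
    false_detection_cost: Dict[Hashable, float],
    missed_cost: Dict[Hashable, float],
) -> float:
    """Average, over samples, of false-detection cost plus missed-label cost.

    Assumption (not from the paper): each label has one fixed cost for being
    falsely detected and one fixed cost for being missed.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels must have equal length")
    if not true_labels:
        raise ValueError("at least one sample is required")

    total = 0.0
    for truth, predicted in zip(true_labels, predicted_labels):
        # Labels that were output but are not in the true label set.
        total += sum(false_detection_cost[label] for label in predicted - truth)
        # Labels in the true label set that were not output.
        total += sum(missed_cost[label] for label in truth - predicted)
    return total / len(true_labels)


# Hypothetical example with three labels and two samples.
truth = [{"a", "b"}, {"c"}]
predicted = [{"a", "c"}, {"c"}]
fd_cost = {"a": 1.0, "b": 1.0, "c": 2.0}
miss_cost = {"a": 5.0, "b": 3.0, "c": 4.0}

example_cost = average_misclassification_cost(truth, predicted, fd_cost, miss_cost)
print(example_cost)
```

In the first sample, label "c" is falsely detected (cost 2.0) and label "b" is missed (cost 3.0); the second sample is predicted exactly. The sketch only evaluates the cost of given predictions; it does not implement the boosting procedure itself.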