
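Instance Importance Based SVM for Solving Imbalanced Data Classification (基于实例重要性的SVM解不平衡数据分类)
Cite this article: YANG Yang, LI Shan-Ping. 基于实例重要性的SVM解不平衡数据分类 [J]. 模式识别与人工智能 (Pattern Recognition and Artificial Intelligence), 2009, 22(6).
Authors: YANG Yang, LI Shan-Ping
Affiliation: College of Computer Science and Technology, Zhejiang University, Hangzhou 310027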
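Abstract: In imbalanced data classification, the minority class, which is the target of interest, is often hard to recognize. Common methods have drawbacks: they require instance importance degrees to be set explicitly, and they support recognition of the minority class only indirectly. This paper therefore proposes an instance importance based support vector machine, IISVM. It has three phases. The first two phases use a one-class support vector machine and a binary support vector machine, respectively, to reorganize the data into three tiers: "most important", "fairly important", and "unimportant". Phase 3 first trains an initial classifier on the most important data and, through explicitly set early stopping conditions, directly supports recognition of the minority class. Experiments show that the average classification performance of IISVM is better than that of current mainstream methods.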

Keywords: imbalanced data; instance importance; support vector machine; resampling; cost-sensitive learning

Instance Importance Based SVM for Solving Imbalanced Data Classification
YANG Yang, LI Shan-Ping. Instance Importance Based SVM for Solving Imbalanced Data Classification [J]. Pattern Recognition and Artificial Intelligence, 2009, 22(6).
Authors: YANG Yang, LI Shan-Ping
Abstract: In imbalanced data classification, the minority class is the classification target, yet it is harder to recognize than the majority class. Current popular classification algorithms have two main disadvantages: they require instance importance degrees to be set explicitly, and they support recognition of the minority class only indirectly. An instance importance based learning algorithm is proposed, namely the instance importance based support vector machine (IISVM). IISVM is composed of three phases. The first two phases use a one-class SVM and a binary SVM, respectively, to divide the training instances into three groups: the most important group, the important group, and the unimportant group. In the last phase, the most important instances are used to train the initial classifier, and explicit stopping criteria are then applied to control recognition of the minority class directly. The experimental results show that the performance of IISVM is superior to that of other standard or advanced solutions.
Keywords: Imbalanced Data; Instance Importance; Support Vector Machine; Resampling; Cost Sensitive Learning
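
The three-phase structure described in the abstract can be sketched as plain control flow. This is only an illustrative sketch, not the authors' algorithm: the abstract does not give the tiering rule or the stopping criterion, so both are assumptions here. The sketch assumes that phase 1 yields a per-instance flag `near_minority` (for example, inliers of a one-class SVM fitted on the minority class), that phase 2 yields a flag `is_support_vector` from a binary SVM, and that phase 3 stops adding less important data as soon as minority-class recall would fall below a floor `min_recall`. The names `assign_tiers`, `staged_fit`, `fit`, and `minority_recall` are hypothetical, and the SVM training itself is left to caller-supplied functions.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Model = TypeVar("Model")

# Tier labels: lower number means more important.
MOST, FAIRLY, UNIMPORTANT = 0, 1, 2


def assign_tiers(
    is_minority: Sequence[bool],
    near_minority: Sequence[bool],
    is_support_vector: Sequence[bool],
) -> List[int]:
    """Bucket instances into three importance tiers.

    ASSUMED rule (not taken from the paper): minority instances, and
    instances flagged by both phases, are most important; instances
    flagged by exactly one phase are fairly important; the rest are
    unimportant.
    """
    tiers = []
    for minority, near, sv in zip(is_minority, near_minority, is_support_vector):
        if minority or (near and sv):
            tiers.append(MOST)
        elif near or sv:
            tiers.append(FAIRLY)
        else:
            tiers.append(UNIMPORTANT)
    return tiers


def staged_fit(
    tiers: Sequence[int],
    fit: Callable[[List[int]], Model],
    minority_recall: Callable[[Model], float],
    min_recall: float,
) -> Tuple[Model, List[int]]:
    """Train on the most important tier, then add tiers with early stopping.

    ASSUMED stopping criterion: a larger training set is accepted only
    while minority-class recall stays at or above `min_recall`.
    """
    selected = [i for i, t in enumerate(tiers) if t == MOST]
    model = fit(selected)  # initial classifier on the most important data
    for tier in (FAIRLY, UNIMPORTANT):
        candidate_idx = selected + [i for i, t in enumerate(tiers) if t == tier]
        candidate = fit(candidate_idx)
        if minority_recall(candidate) < min_recall:
            break  # early stop: keep the previously accepted model
        model, selected = candidate, candidate_idx
    return model, selected


if __name__ == "__main__":
    # Toy run with stub callbacks: the "model" is just the tuple of
    # training indices, and recall drops once all four instances are used.
    demo_tiers = assign_tiers(
        [True, False, False, False],
        [False, True, True, False],
        [False, True, False, False],
    )
    _, used = staged_fit(
        demo_tiers,
        fit=lambda idx: tuple(idx),
        minority_recall=lambda m: 1.0 if len(m) <= 3 else 0.5,
        min_recall=0.8,
    )
    print(demo_tiers, used)
```

In practice the two flags could come from a library such as scikit-learn (for instance `OneClassSVM.predict` and the `support_` attribute of a fitted `SVC`), and `fit` could wrap an `SVC` trained on the selected indices; that wiring is likewise an assumption and not something the abstract specifies.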
This article is indexed in Wanfang Data (万方数据) and other databases.