
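SVM classification algorithm for unbalanced data based on spectral clustering under-sampling
Cite this article: TAO Xin-min, ZHANG Dong-xue, FU Dan-dan, HAO Si-yuan. SVM classification algorithm for unbalanced data based on spectral clustering under-sampling[J]. Control and Decision, 2012, 27(12): 1761-1768.
Authors: TAO Xin-min  ZHANG Dong-xue  FU Dan-dan  HAO Si-yuan
Affiliation: College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001
Funding: National Natural Science Foundation of China, General Program (61074076); China Postdoctoral Science Foundation (20090450119); China Doctoral Program New Teacher Fund (20092304120017); Heilongjiang Province Postdoctoral Fund (LBH-Z08227)
Abstract: A support vector machine (SVM) classification algorithm for unbalanced data based on spectral clustering under-sampling is proposed. The algorithm first applies spectral clustering to the majority-class samples in kernel space. Then, within each cluster, it selects representative, informative points according to the size of the cluster and the distance between the cluster and the minority-class samples. The result is a training set that is balanced in sample count. In the experiments the algorithm is compared with other preprocessing methods for unbalanced data; the results show that it not only effectively improves the classification performance of the SVM on the minority class, but also clearly improves overall classification performance and running efficiency.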

Keywords: unbalanced data  SVM algorithm  spectral clustering  under-sampling
Received: 2011/7/22
Revised: 2011/10/8

SVM classifier for unbalanced data based on spectrum cluster-based under-sampling approaches
TAO Xin-min, ZHANG Dong-xue, HAO Si-yuan, FU Dan-dan. SVM classifier for unbalanced data based on spectrum cluster-based under-sampling approaches[J]. Control and Decision, 2012, 27(12): 1761-1768.
Authors:TAO Xin-min  ZHANG Dong-xue  HAO Si-yuan  FU Dan-dan
Affiliation: College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China
Abstract:

An under-sampling support vector machine (SVM) algorithm for unbalanced datasets, based on spectral clustering, is
presented. Majority instances are clustered by spectral clustering in kernel space, and representative samples are
then resampled using the cluster information. The number of samples selected from each cluster depends on the size of
the cluster and on its distance to all minority instances. This both reduces the number of majority instances and
improves SVM classification performance on unbalanced datasets. In the experiments, the
proposed approach is compared with other data preprocessing methods for unbalanced dataset classification. The experimental
results show that the proposed method not only improves the classification performance of the SVM algorithm on the minority
class, but also increases overall classification performance and efficiency.
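
The per-cluster selection step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so several parts here are assumptions rather than the authors' method: the quota for each cluster is taken to be proportional to cluster size divided by mean distance to the minority class, the points kept are those nearest the minority class, and the kernel is an RBF kernel with a hypothetical `gamma` parameter. The cluster labels are assumed to come from a separate spectral clustering step (for instance scikit-learn's `SpectralClustering`), which is not reproduced here. The function names `rbf_kernel_distance` and `undersample_by_cluster` are illustrative only.

```python
import math
from collections import defaultdict


def rbf_kernel_distance(x, y, gamma=1.0):
    """Distance between x and y in the feature space induced by an RBF kernel."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    # ||phi(x) - phi(y)||^2 = k(x,x) - 2 k(x,y) + k(y,y) = 2 - 2 exp(-gamma * sq)
    return math.sqrt(max(0.0, 2.0 - 2.0 * math.exp(-gamma * sq)))


def undersample_by_cluster(majority, labels, minority, gamma=1.0):
    """Return indices of majority samples to keep (about len(minority) of them).

    `labels[i]` is the cluster id of `majority[i]`, assumed to come from
    spectral clustering performed beforehand.
    """
    clusters = defaultdict(list)
    for idx, lab in enumerate(labels):
        clusters[lab].append(idx)

    # Mean kernel-space distance from each majority point to all minority points.
    point_dist = [
        sum(rbf_kernel_distance(x, m, gamma) for m in minority) / len(minority)
        for x in majority
    ]

    # ASSUMPTION: larger clusters and clusters closer to the minority class
    # receive a larger share of the sampling budget.
    weights = {}
    for lab, members in clusters.items():
        mean_dist = sum(point_dist[i] for i in members) / len(members)
        weights[lab] = len(members) / (mean_dist + 1e-12)
    total = sum(weights.values())

    budget = len(minority)  # target: as many majority as minority samples
    kept = []
    for lab, members in clusters.items():
        quota = min(len(members), max(1, round(budget * weights[lab] / total)))
        # ASSUMPTION: the "representative" points are those nearest the minority class.
        kept.extend(sorted(members, key=lambda i: point_dist[i])[:quota])
    return sorted(kept)


# Toy data: two majority clusters and a small minority class.
majority = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6)]
labels = [0, 0, 0, 0, 1, 1]
minority = [(3, 3), (3, 4)]
kept = undersample_by_cluster(majority, labels, minority, gamma=0.1)
print(kept)
```

Because each quota is rounded independently, the number of kept samples only approximately matches the minority count. The retained majority samples, together with all minority samples, would then form the training set for the SVM.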

Keywords:

unbalanced data|support vector machine algorithm|spectrum cluster|under-sampling

This article is indexed in CNKI, Wanfang Data, and other databases.