
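Research on the Generalization Performance of SVM Imbalanced Data Classification Algorithm Based on Markov Sampling
Cite this article: XU Jie, HE Mei-mei. Research on the Generalization Performance of SVM Imbalanced Data Classification Algorithm Based on Markov Sampling[J]. Acta Electronica Sinica, 2018, 46(11): 2660-2670.
Authors: XU Jie  HE Mei-mei
Affiliation: School of Computer Science and Information Engineering, Hubei University, Wuhan, Hubei 430062
Abstract: This paper relaxes the assumption that samples are independent and identically distributed to the case of uniformly ergodic Markov chains in order to study the generalization performance of imbalanced data classification algorithms. It proposes three algorithms: an SVM imbalanced data classification algorithm based on Markov sampling, an EDSVM imbalanced data classification algorithm based on Markov sampling, and an SVM-WKNN imbalanced data classification algorithm based on Markov sampling. Numerical experiments on 10 real imbalanced datasets from the UCI database show that the misclassification rate of each of the three Markov-sampling-based algorithms is lower than that of the corresponding algorithm based on random sampling, and that, among the three, the SVM-WKNN imbalanced data classification algorithm based on Markov sampling has the best generalization performance.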

Keywords: Markov sampling  support vector machine  k-nearest neighbor algorithm  imbalanced data
Received: 2017-09-04

Research on the Generalization Performance of SVM Imbalanced Data Classification Algorithm Based on Markov Sampling
XU Jie, HE Mei-mei. Research on the Generalization Performance of SVM Imbalanced Data Classification Algorithm Based on Markov Sampling[J]. Acta Electronica Sinica, 2018, 46(11): 2660-2670.
Authors:XU Jie  HE Mei-mei
Affiliation:School of Computer Science and Information Engineering, Hubei University, Wuhan, Hubei, 430062, China
Abstract: This paper relaxes the assumption that samples are independent and identically distributed to the assumption that they form a uniformly ergodic Markov chain, and studies the generalization performance of imbalanced data classification algorithms under this weaker assumption. Three algorithms are proposed: an SVM imbalanced data classification algorithm based on Markov sampling, an EDSVM imbalanced data classification algorithm based on Markov sampling, and an SVM-WKNN imbalanced data classification algorithm based on Markov sampling. Numerical experiments on ten real imbalanced datasets from the UCI database show that the misclassification rates of all three Markov-sampling-based algorithms are lower than those of the corresponding algorithms based on random sampling, and that, among the three, the SVM-WKNN imbalanced data classification algorithm based on Markov sampling has the best generalization performance.
Keywords:Markov sampling  support vector machine  k-nearest neighbor  imbalanced data  