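Modified proximal support vector machine algorithm for dealing with unbalanced samples
Cite this article: LIU Yan, ZHONG Ping, CHEN Jing, SONG Xiaohua, HE Yun. Modified proximal support vector machine algorithm for dealing with unbalanced samples[J]. Journal of Computer Applications, 2014, 34(6): 1618-1621.
Authors: LIU Yan, ZHONG Ping, CHEN Jing, SONG Xiaohua, HE Yun
Affiliation: 1. College of Mechanical and Electrical Engineering, Yanching Institute of Technology, Langfang, Hebei 065201, China; 2. College of Science, China Agricultural University, Beijing 100083, China; 3.
Funding: Supported by the National Natural Science Foundation of China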
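Abstract: When handling unbalanced samples, the proximal support vector machine (PSVM) overfits the class with more sample points and underestimates the misclassification error of the class with fewer sample points, which lowers the classification accuracy over the whole sample set. To address this problem, a modified PSVM algorithm for unbalanced samples is proposed. The new algorithm not only assigns different penalty factors to the positive and negative classes, but also adds a new parameter to the constraints, making the separating surface more flexible. The algorithm first trains on the training set to obtain the optimal parameters, then trains on the test set to obtain the separating hyperplane, and finally outputs the classification results. Experiments on 9 datasets from the UCI database show that the new algorithm improves classification accuracy, by an average of 2.19 percentage points in the linear case and 3.14 percentage points in the nonlinear case, effectively improving the generalization ability of the model.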

Keywords: proximal support vector machine; unbalanced samples; parameter; penalty factor; model improvement
Received: 2013-11-18
Revised: 2014-01-21

Modified proximal support vector machine algorithm for dealing with unbalanced samples
LIU Yan, ZHONG Ping, CHEN Jing, SONG Xiaohua, HE Yun. Modified proximal support vector machine algorithm for dealing with unbalanced samples[J]. Journal of Computer Applications, 2014, 34(6): 1618-1621.
Authors: LIU Yan, ZHONG Ping, CHEN Jing, SONG Xiaohua, HE Yun
Affiliation: 1. College of Mechanical and Electrical Engineering, Yanching Institute of Technology, Langfang, Hebei 065201, China;
2. College of Science, China Agricultural University, Beijing 100083, China
Abstract: When the Proximal Support Vector Machine (PSVM) deals with unbalanced samples, it overfits the class with more samples and underestimates the misclassification error of the class with fewer samples, which reduces the overall classification accuracy. To solve this problem, a modified PSVM for unbalanced samples was proposed. The new algorithm not only set different penalty factors for positive and negative samples, but also added a new parameter to the constraint, making the classification hyperplane more flexible. First, the new algorithm was trained on the training set to obtain the optimal parameters; then the classification hyperplane was obtained by training on the test set; finally, the classification results were output. Experiments on 9 datasets from the UCI database show that the new algorithm improves classification accuracy by an average of 2.19 and 3.14 percentage points in the linear and nonlinear cases, respectively, and effectively strengthens the generalization ability of the algorithm.
Keywords: proximal support vector machine; unbalanced samples; parameter; penalty factor; model improvement
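
The idea of giving the two classes different penalty factors can be illustrated with a small sketch. It assumes the standard linear PSVM formulation, minimizing 0.5 * (w.w + gamma^2) plus a weighted sum of squared errors subject to d_i * (a_i.w - gamma) + y_i = 1, which leads to a single linear system. The abstract does not define the additional constraint parameter, so it is not reproduced here, and this should not be read as the authors' algorithm. The names `fit_weighted_psvm`, `predict`, `c_pos`, and `c_neg` are hypothetical.

```python
import numpy as np


def fit_weighted_psvm(A, d, c_pos=1.0, c_neg=1.0):
    """Fit a linear PSVM with class-dependent penalties (illustrative sketch).

    Minimizes 0.5 * (w.w + gamma^2) + 0.5 * sum_i c_i * y_i^2
    subject to d_i * (a_i.w - gamma) + y_i = 1,
    where c_i = c_pos for positive samples and c_neg for negative samples.
    """
    A = np.asarray(A, dtype=float)
    d = np.asarray(d, dtype=float)  # labels in {+1, -1}
    # Augmented data matrix E = [A, -e], so that E @ [w; gamma] = A w - gamma.
    E = np.hstack([A, -np.ones((A.shape[0], 1))])
    # Per-sample penalty chosen by class (an assumption of this sketch).
    c = np.where(d > 0, c_pos, c_neg)
    # Stationarity condition: (I + E^T C E) z = E^T C d, with C = diag(c).
    lhs = np.eye(E.shape[1]) + E.T @ (c[:, None] * E)
    rhs = E.T @ (c * d)
    z = np.linalg.solve(lhs, rhs)
    return z[:-1], z[-1]  # w, gamma


def predict(A, w, gamma):
    """Classify by the sign of a.w - gamma."""
    return np.where(np.asarray(A, dtype=float) @ w - gamma >= 0, 1, -1)


# Toy unbalanced data: 2 positive samples, 4 negative samples.
X = np.array([[2, 2], [3, 3], [-2, -2], [-3, -3], [-2, -3], [-3, -2]])
labels = np.array([1, 1, -1, -1, -1, -1])
# Penalize errors on the smaller class more heavily (ratio is illustrative).
w, gamma = fit_weighted_psvm(X, labels, c_pos=2.0, c_neg=1.0)
print(predict(X, w, gamma))
```

Setting `c_pos` above `c_neg` makes errors on the smaller positive class more costly, which is one common way to counter the bias toward the larger class. In practice the two penalties would be chosen by validation rather than fixed by hand.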
This article is indexed in CNKI and other databases.