
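SVM training sample reduction and selection algorithm based on improved weighted condensed nearest neighbor and close-to-boundary rules
Cite this article: Hu Zhengping, Gao Wentao. SVM training sample reduction and selection algorithm based on improved weighted condensed nearest neighbor and close-to-boundary rules [J]. 东北重型机械学院学报 (Journal of Northeast Heavy Machinery Institute), 2010(5): 421-425.
Authors: Hu Zhengping, Gao Wentao
Affiliation: College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China
Funding: National Natural Science Foundation of China (61071199); Natural Science Foundation of Hebei Province (F2010001297); China Postdoctoral Science Foundation (200902356; 20080440124)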
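Abstract: Large-scale training sets usually contain many similar samples and a large amount of redundant information that is "useless" for building the classifier model. Training on all samples not only increases training time but may also reduce generalization ability because of over-fitting. To address this problem, this paper considers both the most representative samples and the samples closest to the boundary, and proposes an SVM training sample reduction and selection algorithm based on an improved weighted condensed nearest neighbor rule and a close-to-boundary rule. Recognizing how strongly valuable training samples affect SVM classifier performance, the algorithm introduces subtractive clustering and uses the improved weighted condensed nearest neighbor method to select the most representative samples for training; on this basis, it applies the close-to-boundary rule within random small sample pools to select boundary samples and improve classification accuracy. Experimental results on the UCI and KDD Cup 1999 datasets show that the proposed algorithm effectively removes redundant information from large training sets and achieves better classification performance with fewer samples.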

Keywords: sample selection; weighted condensed nearest neighbor; close-to-boundary; random small sample pools; support vector machine

Training sample selection algorithm for SVM based on modified weighted condensed nearest neighbor and close-to-boundary criterion
Authors:HU Zheng-ping  GAO Wen-tao
Affiliation:(College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China)
Abstract: Large-scale training sets usually contain a large number of similar samples and much redundant information, which lengthens training time and can degrade generalization ability through over-fitting. To deal with this problem, a training sample selection algorithm for SVM based on a modified weighted condensed nearest neighbor (CNN) rule and a close-to-boundary criterion is proposed. Considering the significance of valuable training samples for SVM classification performance, the method combines the most representative samples with close-to-boundary samples: it uses the modified weighted CNN rule together with subtractive clustering to select the most representative samples for training, and then applies the close-to-boundary criterion in random small pools to select boundary samples and improve classification accuracy. Experimental results on the UCI and KDD Cup 1999 datasets show that the proposed algorithm eliminates redundancy and achieves better classification performance with fewer samples.
Keywords:sample selection  weighted CNN  close-to-boundary criterion  random small pools  support vector machines
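
The abstract gives no implementation details, so the following is only a rough sketch of the two ideas it names, not the authors' algorithm. It assumes Hart's classic (unweighted) condensed nearest neighbor rule in place of the paper's modified weighted variant, omits subtractive clustering entirely, and assumes that "close to the boundary" can be approximated by the distance from a sample to its nearest sample of a different class. The function names and the parameters `pool_size`, `per_pool`, and `n_pools` are hypothetical.

```python
import numpy as np


def condensed_nearest_neighbor(X, y, seed=0):
    """Hart's classic CNN rule (assumption: NOT the paper's weighted variant).

    Returns indices of a subset whose 1-NN classifier labels every
    sample in X correctly.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    keep = [int(order[0])]  # start the store with one random sample
    changed = True
    while changed:  # repeat passes until no sample is added
        changed = False
        for i in order:
            i = int(i)
            if i in keep:
                continue
            dists = np.linalg.norm(X[keep] - X[i], axis=1)
            nearest = keep[int(np.argmin(dists))]
            if y[nearest] != y[i]:  # misclassified by the store: absorb it
                keep.append(i)
                changed = True
    return np.array(sorted(keep))


def close_to_boundary(X, y, pool_size=20, per_pool=2, n_pools=5, seed=0):
    """Pick boundary-like samples from random small pools.

    Assumption: a sample is "close to the boundary" when its nearest
    sample of a different class is close. Requires at least two classes.
    """
    rng = np.random.default_rng(seed)
    chosen = set()
    for _ in range(n_pools):
        pool = rng.choice(len(X), size=min(pool_size, len(X)), replace=False)
        scores = []
        for i in pool:
            others = X[y != y[i]]  # all samples from the other classes
            scores.append(np.min(np.linalg.norm(others - X[i], axis=1)))
        best = pool[np.argsort(scores)[:per_pool]]  # smallest distances first
        chosen.update(int(i) for i in best)
    return np.array(sorted(chosen))
```

Under these assumptions, a reduced training set could be formed as the union of the two index arrays and passed to any SVM trainer in place of the full data set.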
This article is indexed in 维普 (VIP) and other databases.