
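Oversampling ensemble algorithm based on margin theory
Cite this article: ZHANG Zongtang, CHEN Zhe, DAI Weiguo. Oversampling ensemble algorithm based on margin theory[J]. 计算机应用 (Journal of Computer Applications), 2019, 39(5): 1364-1367.
Authors: ZHANG Zongtang (张宗堂), CHEN Zhe (陈喆), DAI Weiguo (戴卫国)
Affiliation: Navigation and Observation Department, Navy Submarine Academy, Qingdao, Shandong 266000, China
Abstract: To address the problem that traditional ensemble algorithms are ill-suited to imbalanced data classification, an AdaBoost algorithm based on margin theory (MOSBoost) is proposed. First, the margins of the original samples are obtained through pre-training. Next, minority-class samples are heuristically duplicated according to their margin ranking, forming a new balanced sample set. Finally, the balanced sample set is fed to AdaBoost for training to obtain the final ensemble classifier. In experiments on UCI datasets, the F-measure and G-mean criteria are used to evaluate four algorithms: MOSBoost, AdaBoost, random oversampling AdaBoost (ROSBoost), and random undersampling AdaBoost (RDSBoost). The results show that MOSBoost outperforms the other three algorithms; relative to AdaBoost, MOSBoost improves by 8.4% and 6.2% under the F-measure and G-mean criteria, respectively.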

Keywords: imbalanced data; margin theory; oversampling method; ensemble classifier; machine learning
Received: 2018-11-26
Revised: 2018-12-12
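
The margin-ranked duplication step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact duplication heuristic, so everything below is an assumption made for illustration: labels are binary in {-1, +1}, the margin is the normalized weighted vote of a pre-trained ensemble, and minority samples are copied lowest margin first, cycling until the two classes are the same size. The function names `ensemble_margin` and `margin_oversample` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def ensemble_margin(label: int, votes: Sequence[int], alphas: Sequence[float]) -> float:
    """Normalized voting margin of one sample (label and votes in {-1, +1}).

    Assumption: votes[t] is the prediction of weak learner t and alphas[t]
    is its weight in a pre-trained AdaBoost-style ensemble.
    """
    total = sum(alphas)
    return label * sum(a * v for a, v in zip(alphas, votes)) / total


def margin_oversample(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    margins: Sequence[float],
    minority_label: int,
) -> Tuple[List[Sequence[float]], List[int]]:
    """Duplicate minority samples, lowest margin first, until classes balance.

    Assumption: this ordering is one plausible heuristic, not necessarily
    the rule used by MOSBoost.
    """
    minority = [i for i, label in enumerate(y) if label == minority_label]
    deficit = (len(y) - len(minority)) - len(minority)
    if not minority or deficit <= 0:
        return list(X), list(y)  # nothing to duplicate
    ranked = sorted(minority, key=lambda i: margins[i])  # hardest samples first
    extra = [ranked[k % len(ranked)] for k in range(deficit)]
    return list(X) + [X[i] for i in extra], list(y) + [y[i] for i in extra]


# Toy data: five majority samples (-1) and two minority samples (+1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9], [1.0]]
y = [-1, -1, -1, -1, -1, 1, 1]
margins = [0.8, 0.6, 0.7, 0.9, 0.4, -0.2, 0.5]  # assumed pre-training output
X_bal, y_bal = margin_oversample(X, y, margins, minority_label=1)
```

In this sketch the balanced set `X_bal`, `y_bal` would then be passed to any AdaBoost implementation for the final training stage.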

Over sampling ensemble algorithm based on margin theory
ZHANG Zongtang, CHEN Zhe, DAI Weiguo. Over sampling ensemble algorithm based on margin theory[J]. Journal of Computer Applications, 2019, 39(5): 1364-1367.
Authors:ZHANG Zongtang  CHEN Zhe  DAI Weiguo
Affiliation:Navigation and Observation Department, Navy Submarine Academy, Qingdao Shandong 266000, China
Abstract: To address the problem that traditional ensemble algorithms are not suitable for imbalanced data classification, Over Sampling AdaBoost based on Margin theory (MOSBoost) was proposed. First, the margins of the original samples were obtained by pre-training. Then, the minority class samples were heuristically duplicated according to their margin ranking, forming a new balanced sample set. Finally, the final ensemble classifier was obtained by training AdaBoost with the balanced sample set as input. In experiments on UCI datasets, F-measure and G-mean were used to evaluate MOSBoost, AdaBoost, Random OverSampling AdaBoost (ROSBoost) and Random UnderSampling AdaBoost (RDSBoost). The experimental results show that MOSBoost is superior to the other three algorithms. Compared with AdaBoost, MOSBoost improves by 8.4% and 6.2% under the F-measure and G-mean criteria, respectively.
Keywords: imbalanced data; margin theory; over sampling method; ensemble classifier; machine learning
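
For readers unfamiliar with the two evaluation criteria named in the abstract, the following sketch computes them from predicted labels. It assumes the commonly used definitions, with the minority class treated as positive: F-measure as the harmonic mean of precision and recall, and G-mean as the square root of the product of the true positive rate and the true negative rate. The paper may use a variant, and the function names and toy labels are illustrative only.

```python
import math
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) with `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def f_measure(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall (F1), 0.0 if undefined."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def g_mean(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Geometric mean of true positive rate and true negative rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(tpr * tnr)


# Toy labels: 4 minority (+1) and 6 majority (-1) samples.
y_true = [1, 1, 1, 1, -1, -1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, 1, -1, -1, -1, -1, -1]
f_score = f_measure(y_true, y_pred)
g_score = g_mean(y_true, y_pred)
```

Both scores depend on minority-class performance, which is why they are preferred over plain accuracy when the classes are imbalanced.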
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.