
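Hybrid algorithm for classification of unbalanced datasets
Cite this article:HAN Min, ZHU Xin-rong. Hybrid algorithm for classification of unbalanced datasets[J]. Control Theory & Applications, 2011, 28(10): 1485-1489
Authors:HAN Min, ZHU Xin-rong
Affiliation:Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, Liaoning 116024
Funding:National Natural Science Foundation of China (61074096); National Key Technology R&D Program (2006BAB14B05); National Key Basic Research Program of China (2006CB403405).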
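Abstract:Traditional classification algorithms achieve low accuracy on the minority class when applied to imbalanced data. To address this, a hybrid classification algorithm is proposed that combines a radial basis function neural network (RBFNN) with a random forest ensemble. Random interpolation between minority-class samples is used to balance the distribution of the dataset, and redundant features are removed using the area under the receiver operating characteristic (ROC) curve at a 95% confidence level as the criterion. The input data are then perturbed with the Bagging technique, RBFNNs serve as the base classifiers in the random forest, and the decisions are fused and output by majority voting. The algorithm is applied to UCI data and evaluated by the G-mean and the area under the ROC curve; the results show that it effectively improves classification accuracy on moderately and highly imbalanced data.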

Keywords:imbalanced data   random forest   radial basis function neural network   receiver operating characteristic curve
Received:2010-06-25
Revised:2010-11-02
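
The abstract above mentions two pieces that are easy to make concrete: random interpolation between minority-class samples, and the G-mean criterion. The following is a minimal sketch, not the authors' implementation. The function names `interpolate_minority` and `g_mean` are hypothetical, and the pairing rule (two samples drawn uniformly at random) is an assumption; the paper may restrict pairs to neighbouring samples.

```python
import math
import random


def interpolate_minority(minority, n_new, seed=0):
    """Create n_new synthetic points, each lying on the line segment
    between two distinct, randomly chosen minority-class samples.
    Assumption: pairs are drawn uniformly at random."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)  # two distinct minority samples
        t = rng.random()                # interpolation weight in [0, 1)
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)
```

Because the G-mean multiplies the two per-class rates, a classifier that labels every sample as the majority class scores 0 even when its overall accuracy is high, which is why it suits imbalanced data better than plain accuracy.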

Hybrid algorithm for classification of unbalanced datasets
HAN Min and ZHU Xin-rong. Hybrid algorithm for classification of unbalanced datasets[J]. Control Theory & Applications, 2011, 28(10): 1485-1489
Authors:HAN Min and ZHU Xin-rong
Affiliation:Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology
Abstract:A hybrid algorithm that integrates the radial basis function neural network (RBFNN) with the random forest algorithm is proposed to improve the poor minority-class accuracy that traditional algorithms achieve on unbalanced datasets. First, random interpolations are inserted between adjacent samples of the minority class to balance the data distribution. Redundant features are then removed, using the area under the receiver operating characteristic (ROC) curve at a 95% confidence level as the criterion. The input data are perturbed by the Bagging technique, and RBFNNs are employed as the base classifiers in the random forest. The decisions are fused and the output is determined by majority voting. The method is applied to UCI datasets. The G-mean and the area under the ROC curve show that it improves classification accuracy on moderately and highly unbalanced datasets.
Keywords:imbalanced data   random forest   radial basis function neural network(RBFNN)   receiver operating characteristic(ROC)
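
The ensemble stage in the abstract (Bagging perturbation of the inputs, then majority voting over the base classifiers) can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: `ToyRBFClassifier` is a hypothetical stand-in with a single Gaussian unit per class centred on the class mean, whereas a real RBFNN would learn several centres and output weights. The names `fit_bagged_ensemble` and `majority_vote` and the defaults `n_learners=11` and `width=1.0` are also assumptions.

```python
import math
import random
from collections import Counter


class ToyRBFClassifier:
    """Stand-in base learner: one Gaussian unit per class, centred on the
    class mean. A real RBFNN would learn several centres and output weights."""

    def __init__(self, width=1.0):
        self.width = width
        self.centres = {}

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        # Centre of each unit = per-feature mean of that class's samples.
        self.centres = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict_one(self, x):
        def activation(centre):
            sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
            return math.exp(-sq_dist / (2.0 * self.width ** 2))

        # Predict the class whose Gaussian unit responds most strongly.
        return max(self.centres, key=lambda label: activation(self.centres[label]))


def fit_bagged_ensemble(X, y, n_learners=11, seed=0):
    """Train each learner on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]
        model = ToyRBFClassifier().fit([X[i] for i in idx], [y[i] for i in idx])
        ensemble.append(model)
    return ensemble


def majority_vote(ensemble, x):
    """Return the label predicted by the largest number of learners."""
    votes = Counter(model.predict_one(x) for model in ensemble)
    return votes.most_common(1)[0][0]
```

An odd number of learners avoids tied votes in the two-class case. In practice the ensemble would be trained on the rebalanced, feature-selected data rather than on the raw dataset.
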
This article is indexed in CNKI, Wanfang Data, and other databases.