
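An Oversampling Method for Imbalance Data Based on Three-Way Decision Model
Cite as:HU Feng, WANG Lei, ZHOU Yao. An Oversampling Method for Imbalance Data Based on Three-Way Decision Model[J]. Acta Electronica Sinica, 2018, 46(1): 135-144.
Authors:HU Feng  WANG Lei  ZHOU Yao
Affiliation:Chongqing Key Laboratory of Computational Intelligence (Chongqing University of Posts and Telecommunications), Chongqing 400065
Abstract:Sampling is an effective approach to the problem of classifying imbalanced data. Drawing on three-way decision theory, this paper partitions the samples into three regions according to their distribution: the positive region, the boundary region, and the negative region. On this basis, the minority-class samples in the boundary region and in the negative region are oversampled in different ways, yielding an oversampling algorithm for imbalanced data based on three-way decisions (the TWD-IDOS algorithm). Experimental results show that, with classifiers such as C4.5, KNN, and CART, the proposed algorithm effectively handles binary classification of imbalanced data and outperforms the oversampling algorithms in the literature on metrics such as Recall, F-value, and AUC.

Keywords:three-way decision  neighborhood rough set  boundary sampling  imbalanced data  SMOTE
Received:2016-05-10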

An Oversampling Method for Imbalance Data Based on Three-Way Decision Model
HU Feng, WANG Lei, ZHOU Yao. An Oversampling Method for Imbalance Data Based on Three-Way Decision Model[J]. Acta Electronica Sinica, 2018, 46(1): 135-144.
Authors:HU Feng  WANG Lei  ZHOU Yao
Affiliation:Chongqing Key Laboratory of Computational Intelligence(Chongqing University of Posts and Telecommunications), Chongqing 400065, China
Abstract:Sampling is an effective way to address the classification of imbalanced data. Using the three-way decision model, we divide the universe into three parts according to the distribution of the samples: the positive region, the boundary region, and the negative region. We then oversample the minority-class samples in the boundary region and in the negative region separately, which yields a novel oversampling algorithm for imbalanced data based on the three-way decision model, named TWD-IDOS. Experimental results show that the proposed method effectively handles two-class classification of imbalanced data and outperforms other oversampling methods on measures such as Recall, F-value, and AUC with the C4.5, KNN, and CART classifiers.
Keywords:three-way decision  neighborhood rough set  boundary sampling  imbalanced data  SMOTE  
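
The abstract describes the idea only at a high level, so the following is a rough illustrative sketch and not the TWD-IDOS algorithm itself. It assumes a simple stand-in rule for the three regions: a minority sample is placed in the positive, boundary, or negative region according to the share of minority labels among its k nearest neighbours, with hypothetical thresholds `alpha` and `beta`. Boundary samples are then oversampled by SMOTE-style interpolation toward minority neighbours, and negative-region samples by a shorter interpolation step. The function names, the thresholds, the per-region counts `n_bnd` and `n_neg`, and the two oversampling rules are all assumptions made for illustration.

```python
import numpy as np


def partition_minority(X, y, minority=1, k=5, alpha=0.6, beta=0.2):
    """Split minority samples into 'pos', 'bnd', 'neg' index lists.

    Hypothetical rule: use the share of minority labels among the k nearest
    neighbours (Euclidean distance); alpha and beta are assumed thresholds.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    regions = {"pos": [], "bnd": [], "neg": []}
    for i in np.flatnonzero(y == minority):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the sample itself
        nn = np.argsort(d)[:k]
        share = np.mean(y[nn] == minority)
        if share >= alpha:
            regions["pos"].append(int(i))  # surrounded by its own class
        elif share <= beta:
            regions["neg"].append(int(i))  # surrounded by the majority class
        else:
            regions["bnd"].append(int(i))  # mixed neighbourhood
    return regions


def oversample(X, y, minority=1, k=5, alpha=0.6, beta=0.2,
               n_bnd=2, n_neg=1, seed=0):
    """Append synthetic minority samples for the boundary and negative regions.

    Assumed treatment: boundary samples interpolate up to the full distance
    toward a minority neighbour (as in SMOTE); negative-region samples use
    at most half that distance so new points stay close to the original.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    regions = partition_minority(X, y, minority, k, alpha, beta)
    X_min = X[y == minority]
    new = []
    for region, n_new, max_step in (("bnd", n_bnd, 1.0), ("neg", n_neg, 0.5)):
        for i in regions[region]:
            d = np.linalg.norm(X_min - X[i], axis=1)
            neighbours = X_min[np.argsort(d)[1:k + 1]]  # skip the closest (itself)
            if len(neighbours) == 0:
                continue
            for _ in range(n_new):
                nb = neighbours[rng.integers(len(neighbours))]
                step = rng.uniform(0.0, max_step)
                new.append(X[i] + step * (nb - X[i]))
    if not new:
        return X, y
    y_new = np.full(len(new), minority, dtype=y.dtype)
    return np.vstack([X, np.vstack(new)]), np.concatenate([y, y_new])
```

With a feature matrix `X` and binary labels `y`, calling `oversample(X, y, minority=1)` returns the original data followed by the synthetic minority rows. Positive-region samples are left untouched in this sketch, since the abstract mentions oversampling only in the boundary and negative regions.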