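A New Oversampling Algorithm: DB_SMOTE
Cite this article: LIU Yuxia, LIU Sanmin, LIU Tao, WANG Zhongqun. A new oversampling algorithm DB_SMOTE[J]. Computer Engineering and Applications, 2014, 50(6): 92-95.
Authors: LIU Yuxia  LIU Sanmin  LIU Tao  WANG Zhongqun
Affiliations: 1. College of Civil Engineering and Architecture, Anhui Polytechnic University, Wuhu, Anhui 241000, China 2. College of Computer and Information, Anhui Polytechnic University, Wuhu, Anhui 241000, China 3. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China 4. College of Management and Engineering, Anhui Polytechnic University, Wuhu, Anhui 241000, China
Funding: National Natural Science Foundation of China (No.61300170, No.71371012); Humanities and Social Sciences Foundation of the Ministry of Education (No.13YJA630098); Key Project of the Natural Science Foundation of Anhui Province (No.KJ2013A040); Key Project of the Provincial Outstanding Young Talents Fund for Universities (No.2013sQRL034zD); University Youth Fund (No.2013YQ31, No.2012YQ32).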
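Abstract: To address the asymmetry of class distribution information in imbalanced data sets, a new oversampling algorithm, DB_SMOTE (Distance-based Synthetic Minority Over-sampling Technique), is proposed. It tackles the shortage of samples by synthesizing new minority-class samples. The algorithm extracts seed samples based on the distance between each sample and the class centre, combined with the degree of class aggregation. Following the idea of the SMOTE (Synthetic Minority Over-sampling Technique) algorithm, new minority-class samples are synthesized from the seed samples. A distribution function for the new samples is constructed from the distance between the seed samples and the minority-class centre. Classification experiments on several data sets using this sampling algorithm show that DB_SMOTE is feasible.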

Keywords: imbalanced data learning  oversampling  data classification

New oversampling algorithm DB_SMOTE
LIU Yuxia,LIU Sanmin,LIU Tao,WANG Zhongqun.New oversampling algorithm DB_SMOTE[J].Computer Engineering and Applications,2014,50(6):92-95.
Authors:LIU Yuxia  LIU Sanmin  LIU Tao  WANG Zhongqun
Affiliation:1.College of Civil Engineering and Architecture, Anhui Polytechnic University, Wuhu, Anhui 241000, China 2.College of Computer and Information, Anhui Polytechnic University, Wuhu, Anhui 241000, China 3.College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China 4.College of Management and Engineering, Anhui Polytechnic University, Wuhu, Anhui 241000, China
Abstract: To address the asymmetry of class distribution information in imbalanced data, the DB_SMOTE (Distance-based Synthetic Minority Over-sampling Technique) algorithm is presented, which synthesizes new minority-class samples. Seed samples are selected according to the distance between each sample and the class centre, combined with the degree of class aggregation. New samples are then synthesized from the seed samples following SMOTE (Synthetic Minority Over-sampling Technique). The distribution function of the new samples is formed from the distance between each seed sample and the centre of the minority class. Classification experiment results show that DB_SMOTE is feasible.
Keywords:imbalanced data learning  oversampling  data classification
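
The abstract names three steps (seed extraction by distance to the class centre, SMOTE-style synthesis on the seeds, and a distance-based distribution of the new samples) but does not give the exact rules. The following is only a minimal Python sketch of that outline under stated assumptions, not the authors' implementation: the function name `db_smote_sketch`, the seed rule (keep samples no farther from the centre than the mean distance, used here as a stand-in for class aggregation), and the inverse-distance weighting are all hypothetical choices made for illustration.

```python
import numpy as np

def db_smote_sketch(X_min, n_new, k=3, seed=0):
    """Illustrative distance-based SMOTE variant; the rules below are assumptions."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)   # minority samples, at least 2 rows
    center = X_min.mean(axis=0)                    # minority-class centre
    dist = np.linalg.norm(X_min - center, axis=1)  # distance of each sample to the centre

    # Assumed seed rule: keep samples whose distance to the centre does not
    # exceed the mean distance (a stand-in for the class aggregation degree).
    mask = dist <= dist.mean()
    seeds, seed_dist = X_min[mask], dist[mask]

    # Assumed distribution rule: seeds closer to the centre are drawn more often.
    weights = 1.0 / (seed_dist + 1e-12)
    weights /= weights.sum()
    chosen = rng.choice(len(seeds), size=n_new, p=weights)

    synthetic = np.empty((n_new, X_min.shape[1]))
    for i, s in enumerate(chosen):
        # SMOTE step: pick one of the k nearest minority neighbours of the seed
        # and interpolate at a random point on the segment between them.
        d = np.linalg.norm(X_min - seeds[s], axis=1)
        neighbours = np.argsort(d)[1:k + 1]        # position 0 is the seed itself
        nb = X_min[rng.choice(neighbours)]
        synthetic[i] = seeds[s] + rng.random() * (nb - seeds[s])
    return synthetic
```

Calling `db_smote_sketch(X_min, n_new=100)` on an array of minority-class rows returns 100 synthetic rows that can be appended to the training set before fitting a classifier. Because each new point is an interpolation between two existing minority samples, it always stays inside the range spanned by the original minority data.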
This article is indexed in CNKI, VIP, and other databases.