首页 | 本学科首页   官方微博 | 高级检索  
     


Evolutionary under-sampling based bagging ensemble method for imbalanced data classification
Authors:Bo Sun  Haiyan Chen  Jiandong Wang  Hua Xie
Affiliation:1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China2. National Key Lab of ATFM, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Abstract:In the class imbalanced learning scenario, traditional machine learning algorithms focusing on optimizing the overall accuracy tend to achieve poor classification performance especially for the minority class in which we are most interested. To solve this problem, many effective approaches have been proposed. Among them, the bagging ensemble methods with integration of the under-sampling techniques have demonstrated better performance than some other ones including the bagging ensemble methods integrated with the over-sampling techniques, the cost-sensitive methods, etc. Although these under-sampling techniques promote the diversity among the generated base classifiers with the help of random partition or sampling for the majority class, they do not take any measure to ensure the individual classification performance, consequently affecting the achievability of better ensemble performance. On the other hand, evolutionary under-sampling EUS as a novel undersampling technique has been successfully applied in searching for the best majority class subset for training a good-performance nearest neighbor classifier. Inspired by EUS, in this paper, we try to introduce it into the under-sampling bagging framework and propose an EUS based bagging ensemble method EUS-Bag by designing a new fitness function considering three factors to make EUS better suited to the framework. With our fitness function, EUS-Bag could generate a set of accurate and diverse base classifiers. To verify the effectiveness of EUS-Bag, we conduct a series of comparison experiments on 22 two-class imbalanced classification problems. Experimental results measured using recall, geometric mean and AUC all demonstrate its superior performance.
Keywords:class imbalanced problem  under-sampling  bagging  evolutionary under-sampling  ensemble learning  machine learning  data mining  
本文献已被 SpringerLink 等数据库收录!
点击此处可从《Frontiers of Computer Science》浏览原始摘要信息
点击此处可从《Frontiers of Computer Science》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号