Imbalanced data classification via support vector machines and genetic algorithms
Authors: Jair Cervantes, Xiaoou Li
Affiliation: 1. Posgrado e Investigación, UAEMEX (Autonomous University of Mexico State), Texcoco 56259, Mexico; 2. Departamento de Computacion, CINVESTAV-IPN (National Polytechnic Institute), Mexico City 07360, Mexico
Abstract: Many real-world data sets are imbalanced: they contain a large number of patterns of one class but very few of another. Standard classification methods, such as the support vector machine (SVM), do not work well on these imbalanced data sets (IDS), because it is difficult for an SVM to find the optimal separating hyperplane when it is trained on imbalanced data. In this paper, we propose a genetic algorithm (GA)-based classification method. A draft hyperplane and support vectors are first generated by an SVM. Then, a GA is applied to compensate for the imbalance in the data. Finally, the SVM is used again to find the best hyperplane from the generated data points. Compared with other popular classification algorithms, our method achieves better classification accuracy on several IDS.
Keywords: genetic algorithm, support vector machine, imbalanced data, classification
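
The three steps named in the abstract (draft SVM, GA-based compensation, final SVM) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the GA encoding, operators, or fitness function, so everything below is an assumption made for illustration. In particular, the linear SVM trained by subgradient descent, the use of minority points on or inside the margin as stand-ins for support vectors, the G-mean fitness, and all names (`train_linear_svm`, `ga_svm`, `make_toy_data`) and parameter values are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:          # point violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def decision(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def g_mean(w, b, X, y):
    """Geometric mean of the per-class recalls (assumed fitness measure)."""
    pos = [decision(w, b, x) > 0 for x, yi in zip(X, y) if yi == 1]
    neg = [decision(w, b, x) <= 0 for x, yi in zip(X, y) if yi == -1]
    return ((sum(pos) / len(pos)) * (sum(neg) / len(neg))) ** 0.5

def ga_svm(X, y, k=5, pop_size=8, generations=5, sigma=0.3, seed=0):
    """Labels: +1 = minority class, -1 = majority class (assumed convention)."""
    rng = random.Random(seed)
    # Step 1: draft hyperplane; minority points on or inside the margin
    # serve as approximate support vectors.
    w, b = train_linear_svm(X, y)
    minority = [x for x, yi in zip(X, y) if yi == 1]
    seeds = [x for x in minority if decision(w, b, x) <= 1] or minority

    def jitter(p):
        return [v + rng.gauss(0, sigma) for v in p]

    def fitness(ind):
        # Retrain with the candidate synthetic points, score on the real data.
        wf, bf = train_linear_svm(X + ind, y + [1] * len(ind))
        return g_mean(wf, bf, X, y)

    # Step 2: GA over sets of k synthetic minority points.
    pop = [[jitter(rng.choice(seeds)) for _ in range(k)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 2]                 # elitism: keep the top half
        children = []
        while len(elite) + len(children) < pop_size:
            pa, pb = rng.sample(elite, 2)
            child = [list(rng.choice(pair)) for pair in zip(pa, pb)]  # uniform crossover
            i = rng.randrange(k)
            child[i] = jitter(child[i])              # mutate one point
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)

    # Step 3: final SVM on the original data plus the generated points.
    w, b = train_linear_svm(X + best, y + [1] * k)
    return w, b, best, history

def make_toy_data(seed=1, n_major=40, n_minor=6):
    """Hypothetical 2-D imbalanced data set used only for demonstration."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n_major)]
    X += [[rng.gauss(2, 0.5), rng.gauss(2, 0.5)] for _ in range(n_minor)]
    return X, [-1] * n_major + [1] * n_minor
```

Calling `ga_svm(*make_toy_data())` returns the final weights and bias, the generated minority points, and the best fitness recorded in each generation. Because the top half of each population is carried over unchanged, the recorded best fitness never decreases from one generation to the next; whether it actually improves depends on the data and the assumed parameters.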