Boosting support vector machines for imbalanced data sets
Authors:Benjamin X. Wang  Nathalie Japkowicz
Affiliation:1. Datalong technology Ltd., 430074, Wuhan, Hubei, China
2. School of Information Technology and Engineering, University of Ottawa, 800 King Edward Ave., P.O. Box 450 Stn.A, Ottawa, ON, K1N 6N5, Canada
Abstract:Real-world data mining applications must address the issue of learning from imbalanced data sets. The problem occurs when the instances of one class greatly outnumber those of the other class. Such data sets often cause a default classifier to be built, because of skewed vector spaces or a lack of information. Common approaches for dealing with the class imbalance problem involve modifying the data distribution or modifying the classifier. In this work, we combine both approaches. We use support vector machines with soft margins as the base classifier to solve the skewed vector spaces problem. We then counter the excessive bias introduced by this approach with a boosting algorithm. We found that this ensemble of SVMs makes an impressive improvement in prediction performance, not only for the majority class, but also for the minority class.
This article is indexed in SpringerLink and other databases.
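
The combination described in the abstract, soft-margin SVMs as base classifiers with a boosting algorithm on top, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the boosting variant or the SVM formulation, so the sketch assumes standard AdaBoost and a linear soft-margin SVM trained by subgradient descent on a weighted hinge loss. All function names (`train_weighted_svm`, `boost_svms`, `predict`), the hyperparameters (`lam`, `lr`, `epochs`, `rounds`), and the toy data are hypothetical choices made for illustration.

```python
import math
import random


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_weighted_svm(X, y, d, lam=0.01, lr=0.1, epochs=200):
    """Linear soft-margin SVM (assumed formulation) fit by subgradient descent on
    lam/2 * ||w||^2 + sum_i d[i] * max(0, 1 - y[i] * (w.x[i] + b))."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi, di in zip(X, y, d):
            if yi * (dot(w, xi) + b) < 1.0:  # margin violated: hinge term is active
                grad_w = [g - di * yi * xj for g, xj in zip(grad_w, xi)]
                grad_b -= di * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1


def boost_svms(X, y, rounds=10):
    """AdaBoost (assumed variant) over weighted linear SVMs; labels are -1 or +1."""
    n = len(X)
    d = [1.0 / n] * n  # start from uniform instance weights
    ensemble = []
    for _ in range(rounds):
        w, b = train_weighted_svm(X, y, d)
        preds = [svm_predict(w, b, xi) for xi in X]
        err = sum(di for di, p, yi in zip(d, preds, y) if p != yi)
        if err >= 0.5:
            break  # no better than chance under the current weights
        perfect = err == 0
        err = max(err, 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, w, b))
        if perfect:
            break  # reweighting would change nothing, so later rounds would repeat
        # Increase the weight of misclassified instances, decrease the rest.
        d = [di * math.exp(-alpha * yi * p) for di, yi, p in zip(d, y, preds)]
        total = sum(d)
        d = [di / total for di in d]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of the base SVMs (an empty ensemble defaults to +1)."""
    score = sum(alpha * svm_predict(w, b, x) for alpha, w, b in ensemble)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    random.seed(0)
    # Toy imbalanced data: 95 majority (-1) and 5 minority (+1) points that overlap.
    X = [[random.gauss(-1, 1), random.gauss(-1, 1)] for _ in range(95)]
    X += [[random.gauss(1, 1), random.gauss(1, 1)] for _ in range(5)]
    y = [-1] * 95 + [1] * 5
    ensemble = boost_svms(X, y)
    hits = sum(predict(ensemble, xi) == 1 for xi, yi in zip(X, y) if yi == 1)
    print("models:", len(ensemble), "minority recall:", hits / 5)
```

In this sketch the boosting weights `d` enter the SVM objective directly, so instances that earlier rounds misclassified (often minority-class ones) pull harder on the next decision boundary. Reporting per-class recall rather than overall accuracy matters here, since a default classifier would already score high accuracy on data this skewed.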