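Survey of Research on Mining Imbalanced-Class Data
Cite this article:ZHAI Yun, YANG Bing-ru, QU Wu. Survey of Mining Imbalanced Datasets[J]. Computer Science, 2010, 37(10): 27-32.
Authors:ZHAI Yun  YANG Bing-ru  QU Wu
Affiliations:1. School of Information Engineering, University of Science and Technology Beijing, Beijing 100083; College of Computer Science, Liaocheng University, Liaocheng 252059
2. School of Information Engineering, University of Science and Technology Beijing, Beijing 100083
Funding:This work was supported by the National Natural Science Foundation of China (60675030, 60875029).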
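Abstract:This paper surveys the main research progress of recent years, in China and abroad, on mining imbalanced-class data. It first analyzes the nature of the imbalanced-class data mining problem. It then discusses in detail the techniques for handling it and, according to their essential differences, analyzes and compares the existing methods at the data level and at the algorithm level. Finally, it identifies current research hot spots in imbalanced-class data mining and the main problems that deserve attention in future work.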

Keywords:machine learning, imbalanced-class data, resampling, cost-sensitive learning
Received:2009/11/6
Revised:2010/1/25

Survey of Mining Imbalanced Datasets
ZHAI Yun,YANG Bing-ru,QU Wu.Survey of Mining Imbalanced Datasets[J].Computer Science,2010,37(10):27-32.
Authors:ZHAI Yun  YANG Bing-ru  QU Wu
Affiliation:School of Information Engineering, University of Science and Technology Beijing, Beijing 100083, China; College of Computer Science, Liaocheng University, Liaocheng 252059, China
Abstract:This paper reviews recent research, in China and abroad, on mining data with imbalanced classes. First, it analyzes in depth the existing problems and their underlying nature. Then it discusses in detail the state-of-the-art data mining techniques for the imbalanced learning scenario and, according to their essential differences, analyzes and compares them comprehensively at the data level and at the algorithm level. The paper also summarizes the metrics used to evaluate performance when mining imbalanced data sets, and points out recent hot issues in theoretical studies and applications. Finally, perspectives on future work are discussed.
Keywords:Machine learning  Imbalanced classification  Resampling  Cost sensitive learning
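The abstract separates data-level techniques (such as resampling) from algorithm-level techniques (such as cost-sensitive learning). The following minimal sketch illustrates that distinction only; it is not taken from the paper. The function names `random_oversample` and `inverse_frequency_costs`, the toy data, and the inverse-frequency cost formula are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Data-level sketch (hypothetical helper): duplicate randomly chosen
    samples of each smaller class until every class is as large as the
    largest one. The learner itself is left unchanged."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


def inverse_frequency_costs(labels):
    """Algorithm-level sketch (hypothetical helper): assign each class a
    misclassification cost of n_samples / (n_classes * class_count), so
    errors on rare classes cost more. The data is left unchanged."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Toy data, assumed for illustration: 9 majority samples, 1 minority sample.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9], [5.0]]
y = [0] * 9 + [1]

X_bal, y_bal = random_oversample(X, y)
costs = inverse_frequency_costs(y)
print(Counter(y_bal))
print(costs)
```

In this sketch the data-level route changes the training set and leaves the learner alone, while the algorithm-level route leaves the data alone and would pass the per-class costs to a learner that accepts them.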
This article is indexed in Wanfang Data and other databases.