
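Overview on feature selection in high-dimensional and small-sample-size classification
Cite this article: WANG Xiang, HU Xuegang. Overview on feature selection in high-dimensional and small-sample-size classification[J]. Journal of Computer Applications, 2017, 37(9): 2433-2438.
Authors: WANG Xiang  HU Xuegang
Affiliations: 1. School of Computer and Information, Hefei University of Technology, Hefei 230009; 2. Literature Information Analysis Center, Anhui Institute of Scientific and Technical Information, Hefei 230011
Funding: National 973 Program of China (2016YFC0801406); National Natural Science Foundation of China (61673152); Natural Science Foundation of Anhui Province (1408085QF136).
Abstract: With the development of bioinformatics, gene expression microarrays, image recognition and related technologies, classification of high-dimensional, small-sample-size data has become a challenging task in data mining (including machine learning and pattern recognition), because such data are prone to the "curse of dimensionality" and to overfitting. Feature selection can effectively avoid the curse of dimensionality and improve the generalization ability of classification models, and has therefore become a research hotspot, so a review of the main domestic and international work on feature selection for high-dimensional, small-sample-size data is warranted. First, the nature of the high-dimensional, small-sample-size feature selection problem is analyzed. Second, feature selection methods for such data are categorized, analyzed and compared according to the essential differences between the algorithms. Finally, the challenges facing this line of research and its future directions are discussed.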

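Keywords: feature selection; high-dimensional data; small-sample-size learning; information filtering; support vector machine
Received: 2017-03-27
Revised: 2017-04-21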

Overview on feature selection in high-dimensional and small-sample-size classification
WANG Xiang, HU Xuegang. Overview on feature selection in high-dimensional and small-sample-size classification[J]. Journal of Computer Applications, 2017, 37(9): 2433-2438.
Authors:WANG Xiang  HU Xuegang
Affiliation:1. School of Computer and Information, Hefei University of Technology, Hefei Anhui 230009, China;
2. Literature Information Analysis Department, Anhui Institute of Scientific and Technical Information, Hefei Anhui 230011, China
Abstract: With the development of bioinformatics, gene expression microarrays and image recognition, classification of high-dimensional, small-sample-size data has become a challenging task in data mining, machine learning and pattern recognition. Such data are prone to the "curse of dimensionality" and to overfitting. Feature selection can effectively avoid the curse of dimensionality and improve the generalization ability of classification models, and has thus become a hot research topic. Accordingly, recent worldwide research on feature selection in high-dimensional, small-sample-size classification is briefly reviewed. First, the nature of high-dimensional, small-sample-size feature selection is analyzed. Second, according to their essential differences, feature selection algorithms for high-dimensional, small-sample-size classification are divided into four categories and compared to summarize their advantages and disadvantages. Finally, challenges and future research directions for feature selection on high-dimensional, small-sample-size data are discussed.
Keywords: feature selection; high-dimensional data; small-sample-size learning; information filtering; Support Vector Machine (SVM)
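
Illustrative note: the abstract does not name the four categories of algorithms it compares, so the following is only a minimal sketch of one simple idea, ranking features individually and keeping the best few. The Fisher-style score and the helper names `fisher_scores` and `select_top_k` are assumptions chosen for illustration and are not taken from the paper. The sketch uses only the Python standard library and a tiny hand-made data set with binary labels 0 and 1.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Score each feature: squared gap between class means over summed class variances."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        spread = pvariance(a) + pvariance(b)
        # The small constant guards against division by zero for constant features.
        scores.append((mean(a) - mean(b)) ** 2 / (spread + 1e-12))
    return scores


def select_top_k(X, y, k):
    """Return the sorted indices of the k highest-scoring features."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    # Hypothetical toy data: 6 samples, 5 features; only feature 1 separates the classes.
    X = [
        [0.1, 5.0, 1.0, 0.3, 2.0],
        [0.2, 5.1, 0.0, 0.1, 2.1],
        [0.0, 4.9, 1.0, 0.2, 1.9],
        [0.1, 1.0, 0.0, 0.3, 2.0],
        [0.2, 1.1, 1.0, 0.1, 2.1],
        [0.0, 0.9, 0.0, 0.2, 1.9],
    ]
    y = [1, 1, 1, 0, 0, 0]
    print(select_top_k(X, y, 1))
```

When the number of features far exceeds the number of samples, a ranking like this should be recomputed inside each cross-validation training fold; scoring on the full data set first and validating afterwards gives an optimistically biased accuracy estimate.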