
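Enhancement and Extension of Feature Selection Using Forest Optimization Algorithm
Cite this article: LIU Zhao-Geng, LI Zhan-Shan, WANG Li, WANG Tao, YU Hai-Hong. Enhancement and Extension of Feature Selection Using Forest Optimization Algorithm[J]. Journal of Software, 2020, 31(5): 1511-1524
Authors:LIU Zhao-Geng  LI Zhan-Shan  WANG Li  WANG Tao  YU Hai-Hong
Affiliation:College of Software, Jilin University, Changchun 130012, China;Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012, China;College of Software, Jilin University, Changchun 130012, China;College of Computer Science and Technology, Jilin University, Changchun 130012, China;Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012, China;College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China;College of Computer Science and Technology, Jilin University, Changchun 130012, China;Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012, China
Funding:National Natural Science Foundation of China (61672261); Natural Science Foundation of Jilin Province (20180101043JC); Special Fund for Industrial Technology Research and Development of the Jilin Provincial Development and Reform Commission (2019C053-9)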
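Abstract (translated from the Chinese):As an important data preprocessing method, feature selection not only mitigates the curse of dimensionality but also improves the generalization ability of learning algorithms. A wide variety of methods have been applied to feature selection; among them, algorithms based on evolutionary computation have attracted growing attention in recent years and achieved some success. Recent results show that feature selection using the forest optimization algorithm (FSFOA) offers better classification performance and dimension reduction ability. However, the randomness of its initialization phase and the manually set parameters of its global seeding phase affect the accuracy and dimension reduction ability of the algorithm; moreover, the algorithm has an inherent weakness in handling high-dimensional data. This paper gives an initialization strategy based on the information gain ratio, automatically generates the parameter in the global seeding phase by borrowing the idea of the simulated annealing temperature control function, and gives a fitness function that incorporates the dimension reduction rate; a greedy algorithm is then applied to the resulting high-quality forest, yielding a feature selection algorithm named EFSFOA (enhanced feature selection using forest optimization algorithm). In addition, for high-dimensional data, an ensemble feature selection scheme is adopted to form an ensemble feature selection framework suited to EFSFOA, enabling it to handle feature selection on high-dimensional data effectively. Comparative experiments verify that EFSFOA clearly improves on FSFOA in both classification accuracy and dimension reduction rate, and that its high-dimensional data processing capability is raised...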

Keywords:enhanced feature selection using forest optimization algorithm (EFSFOA)  high-dimensional  feature selection  evolutionary computation
Received: 2018-07-12
Revised: 2018-08-05

Enhancement and Extension of Feature Selection Using Forest Optimization Algorithm
LIU Zhao-Geng,LI Zhan-Shan,WANG Li,WANG Tao,YU Hai-Hong. Enhancement and Extension of Feature Selection Using Forest Optimization Algorithm[J]. Journal of Software, 2020, 31(5): 1511-1524
Authors:LIU Zhao-Geng  LI Zhan-Shan  WANG Li  WANG Tao  YU Hai-Hong
Affiliation:College of Software, Jilin University, Changchun 130012, China;Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012, China;College of Software, Jilin University, Changchun 130012, China;College of Computer Science and Technology, Jilin University, Changchun 130012, China;Key Laboratory of Symbolic Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012, China;College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
Abstract:As an important data preprocessing method, feature selection not only mitigates the curse of dimensionality but also improves the generalization ability of learning algorithms. A variety of methods have been applied to feature selection, among which evolutionary computation techniques have recently gained much attention and shown some success. Recent studies have shown that feature selection using the forest optimization algorithm (FSFOA) has better classification performance and dimension reduction ability. However, the randomness of the initialization phase and the manually set parameters of the global seeding phase affect the accuracy and dimension reduction ability of the algorithm. In addition, the algorithm has an inherent weakness in handling high-dimensional data. In this study, an initialization strategy is given from the perspective of the information gain ratio; the parameter of the global seeding phase is generated automatically by borrowing the idea of the simulated annealing temperature control function; a fitness function is given that incorporates the dimension reduction rate; and a greedy algorithm is applied to the resulting high-quality forest to select the best tree. Together these form a feature selection algorithm, EFSFOA (enhanced feature selection using forest optimization algorithm). Furthermore, for high-dimensional data, an ensemble feature selection scheme is used to build an ensemble feature selection framework suited to EFSFOA, so that it can handle feature selection on high-dimensional data effectively. Comparative experiments verify that EFSFOA significantly improves on FSFOA in both classification accuracy and dimension reduction rate, and that its high-dimensional data processing capability is raised to 100 000 dimensions. Compared with other efficient evolutionary computation approaches to feature selection proposed in recent years, EFSFOA remains highly competitive.
Keywords:enhanced feature selection using forest optimization algorithm (EFSFOA)  high-dimensional  feature selection  evolutionary computation
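
The abstract names two ingredients without giving their formulas: ranking features by information gain ratio for initialization, and a fitness function that takes the dimension reduction rate into account. The sketch below illustrates both ideas in plain Python. It is not the authors' implementation: the function names, the weighted-sum form of the fitness, and the weight `alpha` are assumptions made only for illustration, and features are assumed to be discrete.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the
    entropy of the feature itself (the standard gain ratio definition)."""
    n = len(labels)
    # Group the class labels by feature value.
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Conditional entropy H(labels | feature).
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0.0:
        # A constant feature carries no information.
        return 0.0
    return (entropy(labels) - conditional) / split_info


def fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Hypothetical fitness: weighted sum of classification accuracy and
    dimension reduction rate. The form and `alpha` are assumptions."""
    reduction_rate = 1.0 - n_selected / n_total
    return alpha * accuracy + (1.0 - alpha) * reduction_rate
```

Under these assumptions, features with a higher gain ratio would be preferred when seeding the initial forest, and between two feature subsets of equal accuracy the smaller one would receive the higher fitness.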
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.