
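Hybrid Filter and Improved Adaptive GA for Feature Selection
Cite this article: QIU Yunfei, GAO Huacong. Hybrid Filter and Improved Adaptive GA for Feature Selection[J]. Computer Engineering and Applications (计算机工程与应用), 2021, 57(11): 95-102.
Authors: QIU Yunfei  GAO Huacong
Affiliation: School of Software, Liaoning Technical University, Huludao, Liaoning 125100
Abstract: To address the curse of dimensionality and overfitting that arise in feature selection on high-dimensional, small-sample data, a feature selection method (ReFS-AGA) that combines the Filter and Wrapper models is proposed. The method combines the ReliefF algorithm with normalized mutual information to evaluate feature relevance and quickly screen important features. It then applies an improved adaptive genetic algorithm that introduces an optimal strategy to balance feature diversity. With the dual objectives of minimizing the number of features and maximizing classification accuracy, it designs a new evaluation function that uses the number of features as a regulating term, so that the optimal feature subset is obtained efficiently during iterative evolution. On gene expression data, different classification algorithms are used to classify the reduced feature subsets. The experimental results show that the method effectively eliminates irrelevant features and improves the efficiency of feature selection. Compared with the ReliefF algorithm and the two-stage feature selection algorithm mRMR-GA, it attains the smallest feature subset dimension while improving average classification accuracy by 11.18 and 4.04 percentage points, respectively.

Keywords: feature selection  Filter model  ReliefF algorithm  normalized mutual information  adaptive genetic algorithm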

Hybrid Filter and Improved Adaptive GA for Feature Selection
QIU Yunfei,GAO Huacong.Hybrid Filter and Improved Adaptive GA for Feature Selection[J].Computer Engineering and Applications,2021,57(11):95-102.
Authors:QIU Yunfei  GAO Huacong
Affiliation:School of Software, Liaoning Technical University, Huludao, Liaoning 125100, China
Abstract:To address the curse of dimensionality and overfitting in feature selection on high-dimensional, small-sample data, this paper proposes a feature selection method (ReFS-AGA) that combines the Filter mode and the Wrapper mode. First, the ReliefF algorithm and normalized mutual information are combined to evaluate the relevance of features and quickly select important ones. Then, an improved adaptive genetic algorithm is used to balance the diversity of features. With the objectives of minimizing the number of features and maximizing classification accuracy, a new evaluation function is designed that uses the number of features as an adjusting term, so that the optimal feature subset is obtained efficiently during iterative evolution. Different classification algorithms are used to classify the reduced feature subsets on gene expression data. The experimental results show that the method effectively eliminates irrelevant features and improves the efficiency of feature selection. Compared with the ReliefF algorithm and the two-stage feature selection algorithm mRMR-GA, it obtains the smallest feature subset dimension while improving the average classification accuracy by 11.18 and 4.04 percentage points, respectively.
Keywords:feature selection  Filter mode  ReliefF algorithm  normalized mutual information  adaptive genetic algorithm  
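
The abstract names two ingredients without giving their formulas: a normalized mutual information score used in the Filter stage, and an evaluation function that trades classification accuracy against the number of selected features. The following is a minimal sketch of both, not the authors' definitions. It assumes discrete feature values, the normalization I(X;Y) / sqrt(H(X) H(Y)) (one of several in common use), and a linear penalty on the fraction of features kept. The function names and the `penalty` weight are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in nats) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def normalized_mutual_information(x, y):
    """NMI = I(x; y) / sqrt(H(x) * H(y)); 0.0 if either variable is constant.

    Assumed normalization; the paper's exact definition is not given
    in the abstract.
    """
    hx, hy = entropy(x), entropy(y)
    if hx == 0.0 or hy == 0.0:
        return 0.0
    # I(x; y) = H(x) + H(y) - H(x, y)
    mutual_information = hx + hy - entropy(list(zip(x, y)))
    return mutual_information / math.sqrt(hx * hy)


def fitness(accuracy, mask, penalty=0.1):
    """Hypothetical evaluation function: accuracy minus a feature-count term.

    `mask` is a 0/1 list marking selected features. `penalty` is an
    illustrative weight, not a value taken from the paper.
    """
    selected = sum(mask)
    if selected == 0:
        return 0.0  # an empty subset cannot be classified
    return accuracy - penalty * selected / len(mask)
```

Under this form, two subsets with equal accuracy are ranked by size, so the genetic search is pushed toward smaller subsets; a feature that duplicates the class label gets an NMI near 1, and one independent of it gets an NMI near 0.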
This article is indexed in Wanfang Data and other databases.