
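Feature Selection Algorithm Based on the Filter-Wrapper Pattern*
Cite this article:Zhou Chuan-hua, Liu Zhi-cai, Ding Jing-an, Zhou Jia-yi. Feature Selection Algorithm Based on the Filter-Wrapper Pattern*[J]. Application Research of Computers, 2019, 36(7).
Authors:Zhou Chuan-hua  Liu Zhi-cai  Ding Jing-an  Zhou Jia-yi
Affiliation:School of Management Science and Engineering, Anhui University of Technology (all four authors)
Funding:National Natural Science Foundation of China; Anhui Province Merit-Based Funding Program for Innovation Projects of Returned Overseas Scholars
Abstract:Feature selection is a persistent and important problem in data mining, machine learning, and pattern recognition. To address the selection bias that traditional information gain exhibits when classes and features are unevenly distributed, this paper proposes a feature selection algorithm based on information gain ratio and random forest. The algorithm combines the strengths of the Filter and Wrapper patterns: it first measures each feature comprehensively in terms of information relevance and classification ability, then selects features with a Sequential Forward Selection (SFS) strategy, using classification accuracy as the evaluation metric for each candidate feature subset, and thereby obtains the optimal feature subset. Experimental results show that the algorithm not only reduces the dimensionality of the feature space but also effectively improves the classification performance and recall of the classification algorithm.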

Keywords:information gain ratio  random forest  feature selection  Filter pattern  Wrapper pattern
Received:2018/1/16
Revised:2019/5/25

Feature Selection Algorithm Based on Filter Wrapper Pattern
Zhou Chuan-hua, Liu Zhi-cai, Ding Jing-an, Zhou Jia-yi. Feature Selection Algorithm Based on Filter Wrapper Pattern[J]. Application Research of Computers, 2019, 36(7).
Authors:Zhou Chuan-hua  Liu Zhi-cai  Ding Jing-an  Zhou Jia-yi
Affiliation:School of Management Science and Engineering, Anhui University of Technology
Abstract:Feature selection is one of the most important issues in data mining, machine learning, and pattern recognition. To address the selection bias of the traditional information gain algorithm when classes and features are unevenly distributed, this paper proposes a new feature selection algorithm based on information gain ratio and random forest. The proposed algorithm combines the advantages of the Filter and Wrapper modes. First, features are measured comprehensively from two aspects: information correlation and classification ability. Second, a Sequential Forward Selection (SFS) strategy is used to select features, with classification accuracy as the evaluation index for each feature subset. Finally, the optimal feature subset is obtained. The experimental results show that the proposed algorithm not only reduces the dimension of the feature space but also effectively improves the classification performance and recall of the classification algorithm.
Keywords:information gain ratio  random forest  feature selection  filter mode  wrapper mode
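
The filter-then-wrapper pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how information gain ratio and random forest scores are combined, so this sketch ranks features by gain ratio alone, and it swaps the random forest for a leave-one-out 1-nearest-neighbour classifier so that it runs on the standard library only. The names `gain_ratio`, `loo_accuracy`, `filter_wrapper_select`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(column, labels):
    """Information gain of a discrete feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(column)
    if split_info == 0:  # a constant feature cannot separate the classes
        return 0.0
    return (entropy(labels) - conditional) / split_info


def loo_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of 1-NN (Hamming distance) on the given features.

    Assumption: this simple classifier stands in for the random forest
    named in the abstract.
    """
    correct = 0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum(row[f] != rows[j][f] for f in subset),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


def filter_wrapper_select(rows, labels, evaluate=loo_accuracy):
    """Filter step: rank by gain ratio. Wrapper step: forward selection."""
    columns = list(zip(*rows))
    ranked = sorted(
        range(len(columns)),
        key=lambda f: gain_ratio(columns[f], labels),
        reverse=True,
    )
    selected, best = [], 0.0
    for f in ranked:
        score = evaluate(rows, labels, selected + [f])
        if score > best:  # keep a feature only if accuracy strictly improves
            selected, best = selected + [f], score
    return selected, best


# Toy data: feature 0 predicts the label, feature 1 is noise,
# feature 2 is a unique ID (high information gain, low gain ratio).
rows = [(0, 0, 0), (0, 1, 1), (0, 0, 2), (0, 1, 3),
        (1, 0, 4), (1, 1, 5), (1, 0, 6), (1, 1, 7)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
selected, accuracy = filter_wrapper_select(rows, labels)
print(selected, accuracy)
```

On this toy data the ID-like feature has the same information gain as the truly predictive feature (1 bit), but its gain ratio is only 1/3 because its split information is 3 bits. This is the kind of preference for many-valued features that the abstract attributes to plain information gain. The forward pass then keeps only feature 0, since adding either remaining feature does not raise the accuracy.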