Hybrid feature selection algorithm fused Shapley value and particle swarm optimization
Citation: DENG Xiuqin, LI Wenzhou, WU Jigang, LIU Taiheng. Hybrid feature selection algorithm fused Shapley value and particle swarm optimization[J]. Journal of Computer Applications, 2018, 38(5): 1245-1249. DOI: 10.11772/j.issn.1001-9081.2017112730
Authors: DENG Xiuqin  LI Wenzhou  WU Jigang  LIU Taiheng
Affiliation: 1. School of Applied Mathematics, Guangdong University of Technology, Guangzhou, Guangdong 510006, China; 2. School of Computers, Guangdong University of Technology, Guangzhou, Guangdong 510006, China
Funding: National Natural Science Foundation of China (61672171); Guangdong University of Technology Graduate Innovation and Competition Project (2017YJSCX039).
Abstract: In pattern classification, data often contain irrelevant or redundant features that reduce classification accuracy. To address this, a hybrid feature selection algorithm combining the Shapley value and Particle Swarm Optimization (PSO) is proposed, aiming at the best classification result with the fewest features. The Shapley value from game theory is introduced into the local search of PSO. First, the contribution of each feature in a particle (a feature subset) to the classification result, that is, its Shapley value, is computed. Then the features with the lowest Shapley values are deleted step by step to refine the feature subset and update the particle, which also strengthens the global search ability of the algorithm. Finally, the improved PSO is applied to feature selection, with the classification performance of a Support Vector Machine (SVM) classifier and the number of selected features serving as the evaluation criteria for a feature subset. Classification experiments were run on 17 medical data sets with different numbers of features, taken from the UCI machine learning data sets and from gene expression data sets. The results show that the proposed algorithm removes more than 55% of the irrelevant or redundant features in the data sets, and more than 80% for the medium and large data sets; the selected feature subsets also classify well, with classification accuracy improving by 2 to 23 percentage points.

Keywords: pattern classification  Particle Swarm Optimization (PSO) algorithm  Shapley value  feature selection  Support Vector Machine (SVM)
Received: 2017-11-18
Revised: 2017-12-20
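
The local-search step described in the abstract (score each feature in a particle by its Shapley value, then delete the lowest-scoring feature step by step) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `shapley_values`, `prune_by_shapley`, `toy_value` and the `penalty` weight are hypothetical, the exact enumeration below is only practical for small feature subsets, and the toy value function stands in for the SVM classification performance that the paper uses. The PSO position and velocity updates are omitted.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature for a coalition value function.

    `value` maps a frozenset of features to a score (in the paper's setting,
    a classification performance measure; here any callable will do).
    The cost grows exponentially with len(features).
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def prune_by_shapley(features, value, penalty=0.01):
    """Drop the lowest-Shapley feature repeatedly while fitness does not drop.

    Fitness is an assumed combination of score and subset size:
    value(subset) - penalty * len(subset).
    """
    def fitness(subset):
        return value(frozenset(subset)) - penalty * len(subset)

    current = list(features)
    while len(current) > 1:
        phi = shapley_values(current, value)
        weakest = min(current, key=lambda f: phi[f])
        candidate = [f for f in current if f != weakest]
        if fitness(candidate) < fitness(current):
            break  # removing the weakest feature hurts, so stop
        current = candidate
    return current


# Toy stand-in for classifier accuracy: "a" and "b" help, "c" and "d" do not.
GAIN = {"a": 0.3, "b": 0.2, "c": 0.0, "d": 0.0}


def toy_value(subset):
    return 0.5 + sum(GAIN[f] for f in sorted(subset))


phi = shapley_values(["a", "b", "c", "d"], toy_value)
kept = prune_by_shapley(["a", "b", "c", "d"], toy_value)
print(phi)
print(kept)
```

In this toy example the two zero-gain features are removed and the two useful ones are kept. With a real classifier, `value` would train and score a model on the given feature subset, and the exact enumeration would normally be replaced by a sampled approximation.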