
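Multi-label feature selection algorithm based on Laplacian score
Cite this article: HU Minjie, LIN Yaojin, WANG Chenxi, TANG Li, ZHENG Liping. Multi-label feature selection algorithm based on Laplacian score[J]. Journal of Computer Applications (计算机应用), 2018, 38(11): 3167-3174.
Authors: HU Minjie (胡敏杰)  LIN Yaojin (林耀进)  WANG Chenxi (王晨曦)  TANG Li (唐莉)  ZHENG Liping (郑荔平)
Affiliation: School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000, China
Funding: National Natural Science Foundation of China (61672272); Science and Technology Project of the Education Department of Fujian Province (JAT170347, JAT170350).
Abstract: The traditional Laplacian score feature selection algorithm is suited only to single-label learning and cannot be applied directly to multi-label learning. To address this, a Laplacian score feature selection algorithm for multi-label tasks is proposed. First, the sample similarity matrix is rebuilt from the correlation between samples over the whole label space, considering both the labels that two samples are jointly associated with and the labels that they are jointly not associated with. Then, judgments of the relevance and redundancy among features are introduced into the Laplacian score algorithm, and a forward greedy search strategy evaluates, one candidate at a time, the joint effect of each candidate feature with the features already selected; this joint effect is used to measure feature importance. Finally, experiments are carried out on 6 multi-label data sets with 5 different evaluation metrics. The results show that, compared with Multi-label Dimensionality reduction via Dependence Maximization (MDDM), feature selection for Multi-Label Naive Bayes classification (MLNB), and feature selection for multi-label classification using multivariate mutual information (PMU), the proposed algorithm not only achieves the best classification performance but also shows significant superiority reaching 65%.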

Keywords: feature selection  Laplacian  multi-label classification  search strategy  feature relevance
Received: 2018-03-20
Revised: 2018-06-26

Multi-label feature selection algorithm based on Laplacian score
HU Minjie, LIN Yaojin, WANG Chenxi, TANG Li, ZHENG Liping. Multi-label feature selection algorithm based on Laplacian score[J]. Journal of Computer Applications, 2018, 38(11): 3167-3174.
Authors: HU Minjie  LIN Yaojin  WANG Chenxi  TANG Li  ZHENG Liping
Affiliation: School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000, China
Abstract: To address the problem that the traditional Laplacian score for feature selection cannot be applied directly to multi-label tasks, a multi-label feature selection algorithm based on the Laplacian score was proposed. Firstly, the sample similarity matrix was reconstructed from the correlation between samples over the whole label space, taking into account both the labels that two samples share and the labels that they jointly lack. Then, the relevance and redundancy between features were introduced into the Laplacian score, and a forward greedy search strategy was designed to evaluate the joint effect of each candidate feature with the features already selected, which was used as the measure of the candidate's importance. Finally, extensive experiments were conducted on six multi-label data sets with five different evaluation criteria. The experimental results show that, compared with Multi-label Dimensionality reduction via Dependence Maximization (MDDM), feature selection for Multi-Label Naive Bayes classification (MLNB), and feature selection for multi-label classification using multivariate mutual information (PMU), the proposed algorithm not only has the best classification performance but also shows significant superiority of up to 65%.
Keywords: feature selection  Laplacian score  multi-label classification  search strategy  feature relevance
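
The abstract describes scoring features with a Laplacian score computed on a sample similarity matrix built from the label space. The sketch below illustrates only that base step, under stated assumptions, and is not the authors' implementation. The similarity used here (the fraction of labels on which two samples agree, counting both shared and jointly absent labels) is an assumed stand-in for the paper's reconstructed matrix, whose exact form the abstract does not give. The score itself is the classical Laplacian score, where a lower value means the feature better preserves the similarity structure. The relevance and redundancy terms and the forward greedy search are not reproduced, because the abstract does not state their criterion. The function names and the toy data are hypothetical.

```python
import numpy as np


def label_agreement_similarity(Y):
    """Sample similarity from a binary label matrix Y (n_samples x n_labels).

    Assumption (not taken from the paper): S[i, j] is the fraction of labels
    on which samples i and j agree, counting labels both have and labels
    both lack.
    """
    Y = np.asarray(Y, dtype=float)
    both_present = Y @ Y.T
    both_absent = (1.0 - Y) @ (1.0 - Y).T
    return (both_present + both_absent) / Y.shape[1]


def laplacian_scores(X, S):
    """Classical Laplacian score of each feature column; lower is better."""
    X = np.asarray(X, dtype=float)
    d = S.sum(axis=1)                      # degree of each sample
    L = np.diag(d) - S                     # graph Laplacian L = D - S
    # Subtract the degree-weighted mean from every feature column.
    Xc = X - (d @ X) / d.sum()
    numerator = np.sum(Xc * (L @ Xc), axis=0)        # f^T L f per feature
    denominator = np.sum(d[:, None] * Xc**2, axis=0)  # f^T D f per feature
    # A constant feature has zero weighted variance; rank it last.
    safe = np.maximum(denominator, 1e-12)
    return np.where(denominator > 1e-12, numerator / safe, np.inf)


# Toy data: 4 samples, 2 labels, 3 features (all values are illustrative).
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
X = np.array([
    [1.0, 1.0, 3.0],   # feature 0 follows the label groups,
    [1.0, 0.0, 3.0],   # feature 1 cuts across them,
    [0.0, 1.0, 3.0],   # feature 2 is constant.
    [0.0, 0.0, 3.0],
])

S = label_agreement_similarity(Y)
scores = laplacian_scores(X, S)
ranking = np.argsort(scores)   # feature indices from best to worst
print(scores, ranking)
```

On this toy input the feature that follows the label groups ranks first and the constant feature ranks last. A full multi-label method along the lines of the abstract would replace the plain ranking with a forward search that also accounts for redundancy with the features already chosen.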