
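Multi-label Feature Selection Algorithm Based on Information Gain
Cite this article: LI Ling, LIU Hua-wen, XU Xiao-dan, ZHAO Jian-min. Multi-label Feature Selection Algorithm Based on Information Gain[J]. Computer Science, 2015, 42(7): 52-56.
Authors: LI Ling  LIU Hua-wen  XU Xiao-dan  ZHAO Jian-min
Affiliations: College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004; College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, and Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100055; College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004; College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004
Funding: This work was supported by the National Natural Science Foundation of China (61100119,0,61272468,61170108,9), the Open Project Fund of the National Laboratory of Pattern Recognition (201204214), the China Postdoctoral Science Foundation (2013M530072), the Natural Science Foundation of Zhejiang Province (LY14F020012), and a project of the Education Department of Zhejiang Province (Y201328291).
Abstract: Multi-label feature selection is a technique for improving the performance of multi-label classifiers. Existing techniques of this kind cannot account for both computational complexity and the correlation among labels when producing a reasonable feature subset. To address this problem, a multi-label feature selection algorithm based on information gain is proposed. The algorithm assumes that the features are mutually independent. It first uses the information gain between a single feature and the entire label set to measure how strongly the two are associated, and then removes irrelevant features according to a threshold to obtain the optimal feature subset. Experiments show that the algorithm effectively improves the classification performance of multi-label classifiers.

Keywords: Data mining  Multi-label classification  Feature selection  Information gain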

Multi-label Feature Selection Algorithm Based on Information Gain
LI Ling, LIU Hua-wen, XU Xiao-dan and ZHAO Jian-min. Multi-label Feature Selection Algorithm Based on Information Gain[J]. Computer Science, 2015, 42(7): 52-56.
Authors: LI Ling  LIU Hua-wen  XU Xiao-dan and ZHAO Jian-min
Affiliation: College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, China; College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, China, and Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100055, China; College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, China; and College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, China
Abstract: Multi-label feature selection is a technique for improving the performance of multi-label classifiers. However, existing multi-label feature selection methods fail to balance the possible dependence among labels against computational complexity when searching for a reasonable feature subset. This paper therefore proposes a multi-label feature selection algorithm based on information gain. The algorithm assumes that the features are independent of one another. It first uses the information gain between a single feature and the set of labels to measure how strongly they are correlated, and then removes the irrelevant and redundant features according to a threshold value. The experimental results show that the proposed algorithm effectively improves the performance of multi-label classifiers.
Keywords: Data mining  Multi-label learning  Feature selection  Information gain
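
The two steps named in the abstract (score each feature by its information gain with respect to the label set, then drop features below a threshold) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the sketch rests on several assumptions: features are discrete, the gain between a feature and the label set is taken as the sum of the per-label information gains, and the function names (`entropy`, `information_gain`, `score_features`, `select_features`) and the threshold value are hypothetical. It is not the authors' implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, label):
    """IG(feature; label) = H(feature) + H(label) - H(feature, label)."""
    joint = list(zip(feature, label))
    return entropy(feature) + entropy(label) - entropy(joint)


def score_features(X, Y):
    """Score each feature column of X against all label columns of Y.

    Assumption: the feature-to-label-set score is the sum of the
    per-label information gains. The paper may define it differently.
    """
    feature_cols = list(zip(*X))
    label_cols = list(zip(*Y))
    return [
        sum(information_gain(f, l) for l in label_cols)
        for f in feature_cols
    ]


def select_features(X, Y, threshold):
    """Return indices of features whose score reaches the threshold."""
    scores = score_features(X, Y)
    return [j for j, s in enumerate(scores) if s >= threshold]


# Toy data: 4 samples, 3 discrete features, 2 binary labels.
X = [[0, 0, 1],
     [0, 1, 1],
     [1, 0, 1],
     [1, 1, 1]]
Y = [[0, 0],
     [0, 1],
     [1, 0],
     [1, 1]]

# The threshold 0.5 is an arbitrary illustrative value.
print(score_features(X, Y))
print(select_features(X, Y, threshold=0.5))
```

In the toy data the third feature is constant, so it carries no information about either label and falls below the threshold, while the first two features each determine one label. Because every feature is scored on its own, the cost grows linearly with the number of features, which is consistent with the independence assumption stated in the abstract.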