
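Joint Feature Selection Method Based on Relevance and Redundancy
Cite this article:ZHOU Cheng (周城), GE Bin (葛斌), TANG Jiu-yang (唐九阳), XIAO Wei-dong (肖卫东). Joint Feature Selection Method Based on Relevance and Redundancy[J]. Computer Science (计算机科学), 2012, 39(4): 181-184.
Authors:ZHOU Cheng (周城)  GE Bin (葛斌)  TANG Jiu-yang (唐九阳)  XIAO Wei-dong (肖卫东)
Affiliation:Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073
Funding:National Natural Science Foundation of China; National University of Defense Technology Outstanding Graduate Student Innovation Fund
Abstract:We compare document frequency (DF), a feature selection method that does not use class information, with information gain (IG), mutual information (MI), and the χ² statistic (CHI), which do. On this basis, we analyze the drawbacks of earlier approaches that directly combine these two kinds of methods and propose a joint feature selection algorithm based on relevance and redundancy. The algorithm combines DF with each of IG, MI, and CHI in turn, aiming to remove redundant features while retaining features that help classification, and thereby to improve text sentiment classification. Experimental results show that the joint feature selection method performs well and effectively reduces the feature dimension.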

Keywords:Text sentiment classification  Joint feature selection  Relevance  Redundant feature

Joint Feature Selection Method Based on Relevance and Redundancy
ZHOU Cheng, GE Bin, TANG Jiu-yang, XIAO Wei-dong. Joint Feature Selection Method Based on Relevance and Redundancy[J]. Computer Science, 2012, 39(4): 181-184.
Authors:ZHOU Cheng  GE Bin  TANG Jiu-yang  XIAO Wei-dong
Affiliation:(Science and Technology on Information Systems Engineering Laboratory,National University of Defense Technology,Changsha 410073,China)
Abstract:Based on a comparative study of four feature selection methods, namely document frequency (DF), which is unrelated to class information, and information gain (IG), mutual information (MI), and the chi-square statistic (CHI), which are related to class information, we analyze the disadvantages of combining these two kinds of methods directly and propose a joint feature selection method based on relevance and redundancy that combines DF with one of IG, MI, and CHI. This approach aims to eliminate redundant features, retain features that are useful for classification, and consequently improve the accuracy of text sentiment classification. The experimental results show that the proposed method not only improves performance but also reduces the feature dimension.
Keywords:Text sentiment classification  Joint feature selection  Relevance  Redundant feature
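
The abstract names four scoring measures (DF, IG, MI, CHI) but does not give their formulas or the details of the joint algorithm. As a rough illustration only, the sketch below computes the four scores for a single term in a two-class setting (for example, positive versus negative sentiment), using commonly cited textbook definitions over a 2x2 term/class contingency table. The `Counts` type, the function names, and the choice of base-2 logarithms are assumptions made for this example; this is not the paper's joint selection procedure.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical 2x2 document counts for one term and one class."""
    a: int  # documents in the class that contain the term
    b: int  # documents outside the class that contain the term
    c: int  # documents in the class that lack the term
    d: int  # documents outside the class that lack the term

    @property
    def n(self) -> int:
        return self.a + self.b + self.c + self.d


def document_frequency(k: Counts) -> int:
    # DF ignores class labels: it only counts documents containing the term.
    return k.a + k.b


def chi_square(k: Counts) -> float:
    # CHI = N * (AD - CB)^2 / ((A+C)(B+D)(A+B)(C+D)); 0.0 if a margin is empty.
    denom = (k.a + k.c) * (k.b + k.d) * (k.a + k.b) * (k.c + k.d)
    if denom == 0:
        return 0.0
    return k.n * (k.a * k.d - k.c * k.b) ** 2 / denom


def mutual_information(k: Counts) -> float:
    # Pointwise MI between "term present" and "document in class", in bits.
    if k.a == 0:
        return float("-inf")
    return math.log2(k.a * k.n / ((k.a + k.c) * (k.a + k.b)))


def _entropy(pos: int, neg: int) -> float:
    # Binary entropy (bits) of a class split given as raw counts.
    total = pos + neg
    h = 0.0
    for x in (pos, neg):
        if x > 0:
            p = x / total
            h -= p * math.log2(p)
    return h


def information_gain(k: Counts) -> float:
    # IG = H(class) - P(t) * H(class | t) - P(not t) * H(class | not t).
    if k.n == 0:
        return 0.0
    h_class = _entropy(k.a + k.c, k.b + k.d)
    h_with = _entropy(k.a, k.b)
    h_without = _entropy(k.c, k.d)
    return (h_class
            - (k.a + k.b) / k.n * h_with
            - (k.c + k.d) / k.n * h_without)
```

With these definitions, a term that is distributed identically across both classes (for instance `Counts(25, 25, 25, 25)`) scores zero on IG, MI, and CHI while still having a high DF, which is one way to see why a frequency-based score and a class-aware score capture different things.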
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.