
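Study and Design of an Improved Text Feature Selection Method
Cite this article: Fu Huitao, Kamil Moydin. Study and Design of an Improved Text Feature Selection Method [J]. Computer Applications and Software (计算机应用与软件), 2011, 28(4).
Authors: Fu Huitao (符会涛), Kamil Moydin (卡米力·木衣丁)
Affiliation: School of Information Science and Engineering, Xinjiang University, Urumqi 830046, Xinjiang, China
Abstract: This paper analyzes why text classification performs poorly when mutual information is used for feature selection, and attributes this largely to the method's tendency to favor rare features. On this basis, it proposes a mutual information feature selection method based on dispersion and average frequency. Experimental results show that the improved mutual information method markedly improves text classification performance.

Keywords: feature selection; mutual information; text classification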

STUDY AND DESIGN OF AN IMPROVED TEXT FEATURE SELECTION METHOD
Fu Huitao, Kamil Moydin. STUDY AND DESIGN OF AN IMPROVED TEXT FEATURE SELECTION METHOD [J]. Computer Applications and Software, 2011, 28(4).
Authors: Fu Huitao, Kamil Moydin
Affiliation: School of Information Science and Engineering, Xinjiang University, Urumqi 830046, Xinjiang, China
Abstract: The article explains why text classification performance is low when the mutual information method is used for feature selection, and argues that this is largely due to the method's tendency to select rare features. A mutual information feature selection method based on dispersion and average frequency is then proposed. Experimental results show that the improved mutual information method significantly improves text classification performance.
Keywords: Feature selection; Mutual information; Text classification
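
The rare-feature bias mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common pointwise form of mutual information for text feature selection, MI(t, c) = log(P(t, c) / (P(t) P(c))), estimated from document counts; the function name, variable names, and toy counts below are hypothetical and are not taken from the paper. The corrections the paper proposes are not specified in the abstract and are not sketched here.

```python
import math

def pointwise_mi(n_tc, n_t, n_c, n):
    """Estimate MI(t, c) = log(P(t, c) / (P(t) * P(c))) from document counts.

    n_tc: documents in class c that contain term t
    n_t:  documents that contain term t
    n_c:  documents in class c
    n:    documents in the corpus
    """
    return math.log((n_tc * n) / (n_t * n_c))

# Toy corpus (assumed numbers, for illustration only).
N = 1000    # total documents
N_C = 100   # documents in class c

# Rare term: occurs in only 2 documents, both in class c.
mi_rare = pointwise_mi(n_tc=2, n_t=2, n_c=N_C, n=N)

# Common term: occurs in 120 documents, 90 of them in class c.
mi_common = pointwise_mi(n_tc=90, n_t=120, n_c=N_C, n=N)

print(f"rare term:   {mi_rare:.4f}")
print(f"common term: {mi_common:.4f}")
```

In this toy setting the rare term receives the higher score even though it covers 2 of the 100 class documents, while the common term covers 90 of them. A ranking by this score alone would therefore prefer the rare term.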
This article is indexed in databases including CNKI and Wanfang Data.