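An Improved Mutual Information Feature Selection Method
Cite this article: KANG Lan-lan, DONG Dan-dan. An Improved Mutual Information Feature Selection Method [J]. Digital Community & Smart Home, 2009, 0(35)
Authors: KANG Lan-lan, DONG Dan-dan
Affiliation: Faculty of Applied Science, Jiangxi University of Science and Technology
Abstract: Feature selection is an extremely important research topic in automatic Chinese text categorization. Its purpose is to resolve the tension between the high dimensionality of the feature space and the sparsity of document representation vectors. To address the relatively poor classification performance of the mutual information (MI) feature selection method, an improved mutual information feature selection method, IMI, is proposed. The method takes into account both the frequency with which a feature term occurs in the current text and feature selection when the mutual information value is negative, so it filters low-frequency words more effectively. Experiments with the KNN automatic classifier show that the improved method greatly increases classification accuracy.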
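For context, the baseline that the abstract criticizes can be sketched as follows. This is a minimal illustration of the conventional document-frequency form of mutual information, MI(t, c) = log(A·N / ((A+B)(A+C))), and not the IMI method itself, whose formula is not given in this record. The toy corpus, the category names, and the function name `mi_scores` are assumptions made for illustration only.

```python
import math
from collections import Counter

def mi_scores(docs, labels):
    """Score (term, category) pairs with the conventional MI estimate.

    A = number of documents in the category that contain the term,
    A + B = number of documents that contain the term,
    A + C = number of documents in the category, N = total documents.
    Pairs that never co-occur are skipped (their MI tends to -infinity).
    """
    n = len(docs)
    cat_count = Counter(labels)
    term_count = Counter()
    joint = Counter()
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # document frequency: count each term once per document
            term_count[term] += 1
            joint[(term, label)] += 1
    return {
        (term, cat): math.log(a * n / (term_count[term] * cat_count[cat]))
        for (term, cat), a in joint.items()
    }

# Hypothetical toy corpus, already tokenized.
docs = [
    ["ball", "match", "win"],
    ["ball", "team"],
    ["stock", "market", "ball"],
    ["stock", "rare"],
]
labels = ["sports", "sports", "finance", "finance"]

scores = mi_scores(docs, labels)
for pair in [("ball", "sports"), ("ball", "finance"), ("stock", "finance"), ("rare", "finance")]:
    print(pair, round(scores[pair], 3))
```

In this toy corpus, "rare" appears in a single document yet receives the same score as "stock", which appears in every finance document, and ("ball", "finance") receives a negative score. These are the two behaviors the abstract says IMI addresses: the bias toward low-frequency words and the handling of negative MI values. The in-document term frequency does not enter this baseline at all.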

Keywords: automatic Chinese text categorization; feature selection; mutual information

An Improved Feature Selection Algorithm Based on Mutual Information
KANG Lan-lan, DONG Dan-dan. An Improved Feature Selection Algorithm Based on Mutual Information [J]. Digital Community & Smart Home, 2009, 0(35)
Authors:KANG Lan-lan  DONG Dan-dan
Affiliation: KANG Lan-lan, DONG Dan-dan (Faculty of Applied Science, Jiangxi University of Science and Technology, Ganzhou 341000, China)
Abstract: Feature selection is an extremely important research topic in automatic categorization, and its purpose is to resolve the tension between the high dimensionality of the feature space and the sparsity of document vectors. To address the relatively poor classification results of the mutual information feature selection method, an improved mutual information feature selection method, IMI, is presented. This method takes into account not only the frequency of a feature in the current text, but also the case in which the mutual information value is negat...
Keywords:automatic categorization  feature selection  mutual information  
This article is indexed in CNKI and other databases.