
Comparative Research on Methods of Feature Selection (常用特征选择方法的比较研究)
Cite as: KANG Lan-lan, DONG Dan-dan. Comparative Research on Methods of Feature Selection [J]. Digital Community & Smart Home (数字社区&智能家居), 2009, 5(12): 9787-9789.
Authors: KANG Lan-lan (康岚兰), DONG Dan-dan (董丹丹)
Affiliation: Faculty of Applied Science, Jiangxi University of Science and Technology, Ganzhou 341000, Jiangxi, China
Abstract: Feature selection is an extremely important research topic in automatic Chinese text classification. Its purpose is to resolve the tension between the high dimensionality of the feature space and the sparsity of document representation vectors. Commonly used feature selection methods include document frequency, information gain, mutual information, expected cross entropy, the chi-square statistic, and weight of evidence for text. This paper compares these methods on a KNN text classifier, analyzes the strengths and weaknesses of each feature evaluation function, and examines how the methods perform as the feature dimension varies.

Keywords: automatic Chinese text classification; feature selection; feature evaluation function; performance
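
Two of the methods named in the abstract, document frequency and the chi-square statistic, can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the code below assumes the common textbook form of the chi-square statistic on a 2x2 term/category contingency table. The toy corpus, the function names, and the choice to rank terms by their maximum score over categories are all hypothetical, and documents are assumed to be already segmented into tokens.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): count each term once per document
    return df


def chi_square(docs, labels, term, category):
    """Chi-square statistic of `term` with respect to `category`.

    Uses the 2x2 contingency table:
      a = docs in category containing term,  b = docs outside containing term,
      c = docs in category without term,     d = docs outside without term.
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_category = label == category
        if has_term and in_category:
            a += 1
        elif has_term:
            b += 1
        elif in_category:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, k):
    """Keep the k terms with the highest chi-square score.

    Assumption: a term's score is its maximum over all categories;
    ties are broken alphabetically.
    """
    vocab = sorted({t for tokens in docs for t in tokens})
    categories = sorted(set(labels))
    scores = {
        t: max(chi_square(docs, labels, t, cat) for cat in categories)
        for t in vocab
    }
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy corpus: four pre-tokenized documents in two categories.
docs = [
    ["ball", "team", "win"],
    ["team", "score"],
    ["stock", "market"],
    ["market", "price", "win"],
]
labels = ["sports", "sports", "finance", "finance"]

print(document_frequency(docs).most_common(3))
print(select_features(docs, labels, 2))
```

In this toy corpus, "win" appears equally often in both categories and so receives a chi-square score of zero, while "team" and "market" each occur in only one category and rank highest. Reducing `k` corresponds to shrinking the feature dimension, the setting under which the abstract says the methods were compared.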
This article is indexed in VIP (维普) and other databases.