
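An Improved Gini Index Feature Weighting Algorithm
Cite this article: Ren Guofeng, Li Dehua, Pan Ying. An Improved Gini Index Feature Weighting Algorithm [J]. Computer and Digital Engineering (计算机与数字工程), 2010, 38(12)
Authors: Ren Guofeng, Li Dehua, Pan Ying
Affiliations: Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology; Information Network Center, Guangxi University
Abstract: The TF-IDF feature weighting algorithm widely used in text categorization does not take into account the purity of a feature term or its category property. Building on the Gini index principle and the TF-IDF feature weighting algorithm, this paper proposes an improved feature weighting algorithm based on the Gini index, which introduces the purity of the feature term and the known category property of the classification when computing feature weights. Comparison experiments between the two feature weighting algorithms are then designed, and repeated experiments are run with different numbers of feature terms under an SVM classifier and a kNN classifier. The experimental results show that the improved Gini index feature weighting algorithm performs better.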

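Keywords: improved algorithm; index feature; weighting algorithm; Gini Index Based; feature weight; Gini index; feature term; category property; SVM classifier; experimental results; index principle; text categorization; comparison experiment; purity; kNN classification; design; computation; basis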

An Improved Algorithm for Feature Weight Based on Gini Index
Ren Guofeng, Li Dehua, Pan Ying. An Improved Algorithm for Feature Weight Based on Gini Index [J]. Computer and Digital Engineering, 2010, 38(12)
Authors:Ren Guofeng  Li Dehua  Pan Ying
Affiliation: Ren Guofeng (1), Li Dehua (1), Pan Ying (1, 2). (1) Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074; (2) Information Network Center, Guangxi University, Nanning 530004
Abstract: The TF-IDF feature weight algorithm widely used in text categorization takes into account neither the purity of a feature term nor its known category property. An improved feature weight algorithm based on the Gini index is therefore proposed, which takes both the purity of the feature term and the known category property into account. Experiments are then designed to compare the improved algorithm with the TF-IDF algorithm, using different numbers of features with the SVM and kNN classifiers. The resu...
Keywords:text categorization  feature weight  gini index  
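
The abstract does not state the weighting formula, so the following is only a minimal sketch of how a Gini-style purity factor might be combined with TF-IDF. It assumes purity is measured as the sum over classes of P(c | t)^2, estimated from document counts, and that the final weight is tf × idf × purity with a smoothed idf of log(N/df + 1). The function names `gini_purity` and `tfidf_gini_weights`, the toy corpus, and both formulas are illustrative assumptions, not the authors' published method.

```python
import math
from collections import Counter, defaultdict


def gini_purity(docs, labels):
    """Return {term: sum_c P(c | term)^2}, estimated from document counts.

    Assumed purity measure: 1.0 when a term occurs in one class only,
    lower when its documents are spread across classes.
    """
    term_class_df = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count each document once per term
            term_class_df[term][label] += 1
    purity = {}
    for term, per_class in term_class_df.items():
        total = sum(per_class.values())
        purity[term] = sum((n / total) ** 2 for n in per_class.values())
    return purity


def tfidf_gini_weights(docs, labels):
    """Return one {term: weight} dict per document.

    Assumed combination: weight = tf * log(N / df + 1) * purity.
    """
    n_docs = len(docs)
    df = Counter(term for tokens in docs for term in set(tokens))
    purity = gini_purity(docs, labels)
    weighted = []
    for tokens in docs:
        tf = Counter(tokens)
        weighted.append({
            term: count * math.log(n_docs / df[term] + 1.0) * purity[term]
            for term, count in tf.items()
        })
    return weighted


# Toy corpus (hypothetical): "team" appears in both classes, "goal" in one.
docs = [
    ["goal", "match", "team"],
    ["team", "goal", "goal"],
    ["stock", "market", "team"],
    ["market", "stock", "price"],
]
labels = ["sport", "sport", "finance", "finance"]

purity = gini_purity(docs, labels)
weights = tfidf_gini_weights(docs, labels)
```

In this toy corpus, "goal" occurs only in the sport class and keeps its full TF-IDF weight, while "team" occurs in both classes and is scaled down by its lower purity, which is the kind of class-aware adjustment the abstract describes.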
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.