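Improvement of a feature weighting algorithm based on feature noise weighting
Cite this article: Zhao Hang, Yang Tianqi, Zhao Xiaoxia. Improvement of a feature weighting algorithm based on feature noise weighting[J]. Microcomputer & its Applications (微型机与应用), 2012, 31(3): 66-68
Authors: Zhao Hang  Yang Tianqi  Zhao Xiaoxia
Affiliations: 1. College of Information Science and Technology, Jinan University, Guangzhou, Guangdong 510632
2. College of Computer, South China Normal University, Guangzhou, Guangdong 510631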
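Abstract: The TF-IDF feature weighting algorithm is one of the key algorithms in text classification, but its IDF value is prone to fluctuation under the influence of feature noise. This paper proposes an improved feature weighting algorithm based on feature noise weighting. By analyzing the distribution characteristics of noise features, the algorithm applies a weight to feature noise that does not accurately express the true meaning of a document, reducing the influence of feature noise on IDF and thereby effectively improving the precision and robustness of the algorithm.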

Keywords: vector space model  text classification  feature noise  feature weight  robustness

Feature weight algorithm based on feature noise weighting
Zhao Hang,Yang Tianqi,Zhao Xiaoxia. Feature weight algorithm based on feature noise weighting[J]. Microcomputer & its Applications, 2012, 31(3): 66-68
Authors:Zhao Hang  Yang Tianqi  Zhao Xiaoxia
Affiliation:1. College of Information Science and Technology, Jinan University, Guangzhou 510632, China; 2. College of Computer, South China Normal University, Guangzhou 510631, China
Abstract:The TF-IDF term weighting algorithm is one of the most important algorithms in text classification, but its IDF value fluctuates considerably under the influence of term noise. This paper proposes an improved feature weighting algorithm based on feature noise weighting. The algorithm analyzes the distribution characteristics of noise terms and assigns a weight to term noise that cannot express the true meaning of the document. This reduces the influence of term noise on IDF and improves the precision and robustness of the algorithm.
Keywords:VSM  text classification  feature noise  feature weighting  robustness
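
The sensitivity of IDF to term noise described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's weighting formula, so the code below is only an assumption-laden illustration: `idf` is the classic log(N / df) definition, while `weighted_idf`, its `min_count` threshold, and its `noise_weight` factor are hypothetical names and choices introduced here, not the authors' method. The toy corpora are likewise invented for the example.

```python
import math
from collections import Counter


def tf(doc):
    """Relative term frequency for one tokenized document."""
    counts = Counter(doc)
    return {term: n / len(doc) for term, n in counts.items()}


def idf(term, docs):
    """Classic IDF: log(N / df), df = number of documents containing the term."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df) if df else 0.0


def weighted_idf(term, docs, min_count=2, noise_weight=0.5):
    """Hypothetical variant (NOT the paper's formula): a document where the
    term occurs fewer than min_count times is treated as a noisy occurrence
    and contributes only noise_weight to the document frequency."""
    df = 0.0
    for doc in docs:
        n = doc.count(term)
        if n >= min_count:
            df += 1.0
        elif n > 0:
            df += noise_weight
    return math.log(len(docs) / df) if df else 0.0


# Toy corpus: "kernel" is characteristic of the first document only.
clean = [
    ["kernel", "kernel", "scheduler"],
    ["market", "stock", "price"],
    ["goal", "match", "team"],
    ["recipe", "oven", "flour"],
]
# Same corpus, but "kernel" appears once, incidentally, in two other documents.
noisy = [
    ["kernel", "kernel", "scheduler"],
    ["market", "stock", "price", "kernel"],
    ["goal", "match", "team", "kernel"],
    ["recipe", "oven", "flour"],
]

print(idf("kernel", clean))           # log(4 / 1)
print(idf("kernel", noisy))           # log(4 / 3): two stray occurrences shrink IDF sharply
print(weighted_idf("kernel", noisy))  # log(4 / 2): the drop is damped
print(tf(noisy[0])["kernel"] * weighted_idf("kernel", noisy))  # TF-IDF style weight
```

In this sketch, two incidental occurrences pull the classic IDF of "kernel" from log 4 down to log(4/3), while the down-weighted document frequency limits the drop to log 2. How the paper actually identifies and weights noise terms would have to be taken from the full text.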
This article is indexed in CNKI, Wanfang Data, and other databases.