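Text Categorization Based on Rough-Set Feature Weighting
Cite this article: XU Xin, HUANG Li-can, ZHAO Yu-hong. Text Categorization Based on Rough-Set Feature Weighting [J]. Journal of Zhejiang Sci-Tech University, 2011, 28(4).
Authors: XU Xin  HUANG Li-can  ZHAO Yu-hong
Affiliation: School of Informatics and Electronics, Zhejiang Sci-Tech University, Hangzhou 310018
Funding: Zhejiang Province "Qianjiang Talent Program" project
Abstract: Text categorization is an active research topic in information retrieval, data mining, and related fields, and feature weighting is an important step in the categorization process. To improve classification quality, this paper analyzes rough set theory and the idea of inverse document frequency weighting in depth and proposes a rough-set-based feature weighting method. The method considers the global contribution of each feature term to classification from two aspects, approximation classification accuracy and approximation classification quality, and incorporates the class-label information of the documents into the weights. Text categorization experiments show that the weighting method helps improve the classification performance of the system.

Keywords: rough set theory  feature weighting  text categorization  approximation classification accuracy  approximation classification quality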

Text Categorization by Feature Weighting Scheme Based on Rough Set
XU Xin, HUANG Li-can, ZHAO Yu-hong. Text Categorization by Feature Weighting Scheme Based on Rough Set [J]. Journal of Zhejiang Sci-Tech University, 2011, 28(4).
Authors: XU Xin  HUANG Li-can  ZHAO Yu-hong
Affiliation: School of Informatics and Electronics, Zhejiang Sci-Tech University, Hangzhou 310018, China
Abstract: Text categorization is a focus of many areas, such as information retrieval and data mining. Feature weighting is an important problem in text categorization. To compute feature weights, this paper presents a feature weighting scheme for text categorization based on rough set theory. The authors analyze the characteristics of rough set theory and TF-IDF, and consider the overall influence that the keywords exert on the classification from the aspects of approximation accuracy and approximation qual...
Keywords: rough set  feature weighting  text categorization  approximation accuracy  approximation quality
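
The two rough-set quantities named in the abstract can be illustrated with a small sketch. It uses the standard Pawlak definitions: the approximation accuracy of a classification is the total size of the lower approximations of the classes divided by the total size of their upper approximations, and the approximation quality is the total size of the lower approximations divided by the number of objects. The function name `approximation_measures`, the use of a single term's presence value as the conditional attribute, and the toy data are assumptions made for illustration. The abstract does not state how the paper combines these measures with TF-IDF, so no weighting formula is shown.

```python
from collections import defaultdict


def approximation_measures(feature_values, labels):
    """Return (accuracy, quality) of the classification given by `labels`,
    approximated with the indiscernibility relation of a single feature.

    Hypothetical helper: it follows the textbook rough-set definitions,
    not necessarily the exact formulation used in the paper.
    """
    if len(feature_values) != len(labels):
        raise ValueError("feature_values and labels must have equal length")

    # Equivalence classes: documents sharing a feature value are indiscernible.
    blocks = defaultdict(set)
    for doc_id, value in enumerate(feature_values):
        blocks[value].add(doc_id)

    # Decision classes: documents grouped by category label.
    classes = defaultdict(set)
    for doc_id, label in enumerate(labels):
        classes[label].add(doc_id)

    lower_total = 0  # sum of lower-approximation sizes over all classes
    upper_total = 0  # sum of upper-approximation sizes over all classes
    for members in classes.values():
        for block in blocks.values():
            if block <= members:   # block lies entirely inside the class
                lower_total += len(block)
            if block & members:    # block overlaps the class
                upper_total += len(block)

    n = len(labels)
    accuracy = lower_total / upper_total if upper_total else 0.0
    quality = lower_total / n if n else 0.0
    return accuracy, quality


# Toy data (assumed): 1 means the term occurs in the document, 0 means it does not.
term_present = [1, 1, 0, 0, 1, 0]
categories = ["sports", "sports", "politics", "politics", "sports", "sports"]
accuracy, quality = approximation_measures(term_present, categories)
print(accuracy, quality)
```

In this toy case the documents containing the term all belong to one category, while the documents without it are mixed, so only half of the documents fall into a lower approximation. A term that separates the categories perfectly scores 1.0 on both measures, and a term that takes the same value in every document scores 0.0 on both, which is the kind of global, class-aware signal the abstract describes adding to the weights.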
This article is indexed in CNKI, Wanfang Data, and other databases.