Web Document Classification Based on Rough Set Latent Semantic Indexing

Cite this article:	HE Ming, FENG Boqin, FU Xianghua. Web Document Classification Based on Rough Set Latent Semantic Indexing[J]. Computer Engineering, 2004, 30(13): 3-5

Authors:	HE Ming, FENG Boqin, FU Xianghua

Affiliation:	Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049 (all three authors)

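Abstract:	Rough set theory is a mathematical tool for handling uncertain or vague knowledge. This paper proposes a Web document classification method based on latent semantic indexing and rough set theory. Web documents are first represented with the vector space model; information filtering and latent semantic indexing are then carried out through singular value decomposition of the matrix; an attribute reduction algorithm is applied to generate classification rules; finally, documents are classified using multiple knowledge bases. Experimental comparison shows that the method achieves good classification results.
Keywords:	rough set; latent semantic indexing; Web document classification; information filtering; information retrieval
Article ID:	1000-3428(2004)13-0003-03
Revised:	2003-06-06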
Web Document Classification Based on Rough Set Latent Semantic Indexing

HE Ming,FENG Boqin,FU Xianghua. Web Document Classification Based on Rough Set Latent Semantic Indexing[J]. Computer Engineering, 2004, 30(13): 3-5

Authors:	HE Ming, FENG Boqin, FU Xianghua

Abstract:	Rough set theory is a mathematical tool for dealing with uncertain or vague knowledge. An approach to Web document classification based on rough set latent semantic indexing is proposed. First, Web documents are represented with the vector space model, giving a reduced document feature set. Then, information filtering and latent semantic indexing are performed by singular value decomposition of the matrix. Classification rules are generated by an attribute reduction algorithm. Finally, the documents are classified with multiple knowledge bases. The experimental results and a comparison with other methods show that this approach to Web document classification achieves good classification performance.

Keywords:	Rough set; Latent semantic indexing; Web document classification; Information filtering; Information retrieval
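
The first two steps named in the abstract (vector space representation, then latent semantic indexing by singular value decomposition) can be illustrated with a minimal sketch. The abstract does not give the term weighting, the number of retained dimensions, or any detail of the rough set rule generation, so everything here is an assumption made for illustration: raw term counts, a four-document toy corpus, k = 2, and the hypothetical helper names `term_document_matrix` and `lsi_project`. The attribute reduction and multi-knowledge-base classification steps are not covered.

```python
import numpy as np

def term_document_matrix(docs):
    """Build a raw term-frequency matrix (terms x documents) for a toy corpus."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, vocab

def lsi_project(A, k):
    """Keep the k largest singular triplets; return one k-dim vector per document."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Columns of diag(s_k) @ Vt_k are the documents in the latent space;
    # transposing gives one row per document.
    return (s[:k, None] * Vt[:k, :]).T

# Toy corpus (assumed for illustration, not data from the paper).
docs = [
    "rough set theory handles vague knowledge",
    "rough set attribute reduction generates rules",
    "latent semantic indexing uses singular value decomposition",
    "singular value decomposition filters noisy terms",
]

A, vocab = term_document_matrix(docs)
doc_vectors = lsi_project(A, k=2)
print(doc_vectors.shape)
```

Truncating to the k largest singular values is what performs the filtering: directions with small singular values are discarded, and documents are compared in the remaining low-dimensional space. In a pipeline like the one the abstract outlines, these reduced vectors would presumably be discretized before a rough set attribute reduction step, but that detail is not stated in the source.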
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
