
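An Improved Text Clustering Algorithm Based on Latent Semantic Indexing
Cite this article: HOU Ze-min, JU Xiao. An Improved Text Clustering Algorithm Based on Latent Semantic Indexing[J]. Computer and Modernization (计算机与现代化), 2014, 0(7): 24-27.
Authors: HOU Ze-min (侯泽民), JU Xiao (巨筱)
Affiliation: Department of Information Engineering, University for Science & Technology Zhengzhou, Zhengzhou, Henan 450064
Funding: Natural Science Foundation project of the Zhengzhou Science and Technology Bureau (201210439)
Abstract: An improved text clustering algorithm based on latent semantic indexing (LSI) is proposed. The algorithm introduces LSI theory and improves the traditional self-organizing map (SOM) algorithm. Text feature vectors are represented using LSI, which mines the hidden semantic structure among the words of a text, thereby eliminating the correlation between words and reducing the dimensionality of the feature vectors. The algorithm also overcomes a limitation of the traditional SOM algorithm so that the number of clusters is given accurately. Experimental results show that the proposed algorithm achieves better clustering quality in less clustering time.

Keywords: text clustering; latent semantic indexing; self-organizing map
Received: 2014-07-17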

An Improved Text Clustering Algorithm Based on Latent Semantic Indexing
HOU Ze-min, JU Xiao. An Improved Text Clustering Algorithm Based on Latent Semantic Indexing[J]. Computer and Modernization, 2014, 0(7): 24-27.
Authors: HOU Ze-min, JU Xiao
Affiliation: Department of Information Engineering, University for Science & Technology Zhengzhou, Zhengzhou 450064, China
Abstract: This paper presents an improved text clustering algorithm based on latent semantic indexing (LSI). The algorithm introduces LSI theory and improves the traditional self-organizing map (SOM) algorithm. Text feature vectors are represented using LSI, which mines the semantic structure hidden among the words of a text, thereby eliminating the correlation among words and reducing the dimensionality of the feature vectors. The algorithm also addresses a limitation of the traditional SOM algorithm so that the number of clusters is determined accurately. Experimental results show that the proposed algorithm clusters better and in less time.
Keywords: text clustering; latent semantic indexing; self-organizing map
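
The abstract describes the LSI step only in words. As a rough illustration of the general technique (not the authors' implementation, which this record does not give), the sketch below uses a truncated singular value decomposition to map documents into a low-dimensional latent space. The toy matrix `A`, the rank `k=2`, and the helper names `lsi_project` and `cosine` are all assumptions made for the example; the improved SOM clustering step is not sketched because its details are not stated here.

```python
import numpy as np

def lsi_project(term_doc, k):
    """Return k-dimensional latent vectors, one row per document (illustrative helper)."""
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Rank-k truncation: keep the k largest singular values.
    # Column j of diag(s_k) @ Vt_k is the latent representation of document j.
    return (s[:k, None] * vt[:k, :]).T  # shape: (n_docs, k)

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy term-document count matrix (rows = terms, columns = documents).
A = np.array([
    [1, 0, 1, 0, 0],  # "car"
    [0, 1, 1, 0, 0],  # "automobile"
    [1, 1, 1, 0, 0],  # "engine"
    [0, 0, 0, 1, 1],  # "flower"
    [0, 0, 0, 1, 1],  # "petal"
], dtype=float)

docs = lsi_project(A, k=2)

# Documents 0 and 1 share only "engine" in the raw term space...
print("raw cosine(d0, d1):", cosine(A[:, 0], A[:, 1]))
# ...but fall on the same latent direction after truncation.
print("LSI cosine(d0, d1):", cosine(docs[0], docs[1]))
# Documents about unrelated topics stay (numerically) orthogonal.
print("LSI cosine(d0, d3):", cosine(docs[0], docs[3]))
```

In this toy case the five-dimensional term vectors shrink to two dimensions, and the two documents that use different words for the same topic become nearly identical. This is the kind of correlation removal and dimensionality reduction the abstract refers to, and the resulting low-dimensional vectors are what a clustering method such as SOM would then consume.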
This article is indexed in VIP (维普) and other databases.