Similarity matrix-based K-means algorithm for text clustering
Authors:CAO Qi-min  GUO Qiao  WU Xiang-hua
Affiliation:School of Automation, Beijing Institute of Technology, Beijing 100081, China
Abstract:The K-means algorithm is one of the most widely used algorithms in cluster analysis. To address the problems caused by the random selection of initial center points in the traditional algorithm, this paper proposes an improved K-means algorithm based on the similarity matrix. The improved algorithm avoids random selection of the initial center points, so it supplies effective initial points for the clustering process and reduces the fluctuation in clustering results caused by the choice of initial points, which yields better clustering quality. The experimental results show that the improved K-means algorithm achieves a substantially higher F-measure and more stable clustering results.
Keywords:text clustering  K-means algorithm  similarity matrix  F-measure
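
The abstract does not spell out how the initial centers are derived from the similarity matrix, so the following is only a minimal sketch of one plausible deterministic seeding, not the authors' procedure. It assumes documents are already term-frequency vectors and uses cosine similarity; the selection rule (most central document first, then the document least similar to the seeds chosen so far) and all function names are illustrative assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(docs):
    """Pairwise cosine similarities between all document vectors."""
    n = len(docs)
    return [[cosine(docs[i], docs[j]) for j in range(n)] for i in range(n)]

def select_initial_centers(sim, k):
    """Deterministic seeding from a similarity matrix.

    Assumed heuristic (hypothetical, not taken from the paper):
    start from the document with the highest total similarity, then
    repeatedly add the document least similar to its closest seed.
    """
    n = len(sim)
    seeds = [max(range(n), key=lambda i: sum(sim[i]))]
    while len(seeds) < k:
        candidates = [i for i in range(n) if i not in seeds]
        seeds.append(min(candidates, key=lambda i: max(sim[i][s] for s in seeds)))
    return seeds

def kmeans(docs, seeds, iterations=10):
    """Plain K-means using cosine similarity, started from the given seeds."""
    centers = [list(docs[s]) for s in seeds]
    labels = [0] * len(docs)
    for _ in range(iterations):
        # Assign each document to its most similar center.
        labels = [max(range(len(centers)), key=lambda c: cosine(d, centers[c]))
                  for d in docs]
        # Recompute each center as the mean of its members.
        for c in range(len(centers)):
            members = [d for d, lab in zip(docs, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy term-frequency vectors for two topics (made-up illustration data).
docs = [
    [3, 1, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 2, 3],
    [0, 0, 1, 2],
    [3, 2, 0, 0],
]
sim = similarity_matrix(docs)
seeds = select_initial_centers(sim, k=2)
labels = kmeans(docs, seeds)
print(seeds)   # [1, 2]
print(labels)  # [0, 0, 1, 1, 0]
```

Because the seeds depend only on the similarity matrix, repeated runs on the same data give the same clustering, which is the kind of stability the abstract attributes to avoiding random initialization.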
Indexed in:CNKI, Wanfang Data, and other databases
Source:Journal of Beijing Institute of Technology (English Edition)