Research on Improving the K-Means Algorithm Based on Internet Public Opinion (基于网络舆情的K-Means算法的改进研究)

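Cite this article:	Luo Huixia, Qu Xiaoling. 基于网络舆情的K-Means算法的改进研究 (Research on Improving the K-Means Algorithm Based on Internet Public Opinion) [J]. 电脑开发与应用 (Computer Development & Applications), 2010, 23(8): 4-6, 15.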

Authors:	Luo Huixia (罗晖霞), Qu Xiaoling (曲晓玲)

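Affiliations:	1. North University of China (中北大学), Taiyuan 030051; 2. General Office of the Shanxi Provincial Government (山西省政府办公厅), Taiyuan 030002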

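Abstract (translated from the Chinese):	The traditional K-Means clustering algorithm is guaranteed to converge only to a local optimum, which makes the clustering result highly sensitive to the choice of initial representative points. Agglomerative hierarchical clustering does not require choosing initial cluster centers, but its computational complexity is higher and the agglomeration process is irreversible. Taking the characteristics of internet public opinion into account, the paper analyzes in depth the strengths and weaknesses of the K-Means and agglomerative hierarchical clustering algorithms and improves K-Means. The core idea of the improved algorithm is to integrate and optimize the two algorithms by combining their respective strengths in initial point selection and in the clustering process. Experimental analysis and practical application show that the improved text clustering algorithm can substantially improve the accuracy and validity of clustering results for internet public opinion information, as well as the efficiency of the algorithm.
Keywords:	internet public opinion; text clustering; K-Means algorithm; agglomerative hierarchical clustering; clustering process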
The Improvement of K-Means Clustering Algorithm based on Internet Public Opinion

Abstract:	The traditional K-Means clustering algorithm is guaranteed to converge only to a local optimum, so the clustering results are very sensitive to the choice of initial representative points. Agglomerative hierarchical clustering needs no initial cluster centers and can automatically generate a clustering model of the text set at different levels, but it has higher computational complexity and its merging process is irreversible. This article analyzes in depth the advantages and disadvantages of the K-Means and agglomerative hierarchical clustering algorithms in light of the characteristics of internet public opinion, and improves the K-Means clustering algorithm. The core idea of the improved algorithm is to integrate and optimize the two algorithms by combining their respective advantages in initial point selection and in the clustering process. Practical application shows that the improved algorithm can improve the quality and efficiency of clustering internet public opinion information.

Keywords:	internet public opinion; text clustering; K-Means algorithm; hierarchical agglomerative clustering; clustering process
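The core idea stated in the abstract, using agglomerative hierarchical clustering to choose the initial points and K-Means for the clustering process itself, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the Euclidean distance, the centroid linkage, the random subsampling step, and all function names (`agglomerative_centers`, `kmeans`, `hybrid_cluster`) are assumptions made here for illustration. In a text-clustering setting the points would typically be document vectors.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length tuples (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def agglomerative_centers(points, k):
    """Merge the two closest clusters (centroid linkage, assumed) until k remain.

    Returns the centroids of the k remaining clusters.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])  # merging is irreversible
        del clusters[j]
    return [mean(c) for c in clusters]


def kmeans(points, centers, max_iter=100):
    """Standard K-Means iterations starting from the given centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = mean(members)
    return labels, centers


def hybrid_cluster(points, k, sample_size=50, seed=0):
    """Seed K-Means with centroids from agglomerative clustering of a sample.

    Subsampling is an assumption made here to limit the quadratic cost of
    the agglomerative step; the source does not say how that cost is handled.
    """
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    size = max(k, sample_size)
    sample = points if len(points) <= size else rng.sample(points, size)
    seeds = agglomerative_centers(sample, k)
    return kmeans(points, seeds)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
    labels, centers = hybrid_cluster(data, 2)
    print(labels, centers)
```

Because the initial centers come from a deterministic merge rather than a random draw, the K-Means stage in this sketch starts from the same seeds on every run for a fixed sample, which is one way to address the sensitivity to initial points that the abstract describes.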
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.
