K-means algorithm based on average density optimizing initial cluster centre
Original title: 基于平均密度优化初始聚类中心的k-means算法
Cite as: XING Changzheng, GU Hao. K-means algorithm based on average density optimizing initial cluster centre[J]. Computer Engineering and Applications (计算机工程与应用), 2014, 50(20): 135-138.
Authors: XING Changzheng (邢长征)  GU Hao (谷浩)
Affiliation: School of Electronic and Information Engineering, Liaoning Technical University, Huludao, Liaoning 125105, China
Abstract: Existing k-means algorithms that optimize the initial cluster centres by density search a large range for cluster centres, take a long time, and produce clustering results that are sensitive to isolated points. To address these problems, this paper proposes adk-means, a k-means algorithm that optimizes the initial cluster centres using the average density. The algorithm first separates the isolated points from the data set and computes the average density of the remaining samples; the isolated points take no part in computing the mean of the samples in each cluster during clustering. Cluster centres are selected from the set of density parameters that exceed the average density, and each isolated point is then assigned to its nearest cluster centre by the minimum-distance principle, until the whole data set has been classified. Experimental results show that adk-means converges faster, is more stable, and achieves higher clustering accuracy than existing density-based k-means algorithms, and that it eliminates the sensitivity of the clustering results to isolated points.

Keywords: k-means algorithm  cluster centre  average density  isolated points  convergence
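
The abstract describes the procedure only in outline, so the following Python sketch is one possible reading of it, not the authors' implementation. Several details are assumptions introduced here for illustration: the density of a sample is taken to be the number of other samples within a radius `eps`; a sample is treated as isolated when it has fewer than `min_pts` such neighbours; and the k centres are picked from the above-average-density candidates by starting with the densest one and then repeatedly taking the candidate farthest from the centres already chosen. The names `adk_means_sketch`, `eps`, and `min_pts` are hypothetical and do not come from the paper.

```python
import numpy as np


def adk_means_sketch(X, k, eps, min_pts, max_iter=100):
    """Illustrative average-density k-means (eps and min_pts are assumed knobs)."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Assumed density measure: number of other samples within radius eps.
    density = (dist <= eps).sum(axis=1) - 1
    # Assumed isolation rule: fewer than min_pts neighbours within eps.
    isolated = density < min_pts
    core = np.flatnonzero(~isolated)
    if core.size < k:
        raise ValueError("fewer non-isolated samples than clusters")

    # The average density is computed over the non-isolated samples only.
    avg_density = density[core].mean()
    candidates = core[density[core] > avg_density]
    if candidates.size < k:
        # Fallback (assumption): too few dense samples, so use all core samples.
        candidates = core

    # Assumed selection rule: densest candidate first, then farthest-first.
    chosen = [candidates[np.argmax(density[candidates])]]
    while len(chosen) < k:
        gap = dist[np.ix_(candidates, chosen)].min(axis=1)
        chosen.append(candidates[np.argmax(gap)])
    centres = X[chosen].copy()

    # Standard k-means iterations; isolated samples never enter the means.
    Xc = X[core]
    for _ in range(max_iter):
        d = np.linalg.norm(Xc[:, None, :] - centres[None, :, :], axis=2)
        core_labels = d.argmin(axis=1)
        new_centres = np.array([
            Xc[core_labels == j].mean(axis=0) if np.any(core_labels == j)
            else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres

    # Finally, every sample (isolated ones included) goes to its nearest centre.
    d_all = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    labels = d_all.argmin(axis=1)
    return labels, centres, isolated
```

Calling `adk_means_sketch(X, k=2, eps=1.0, min_pts=2)` on a small two-dimensional array returns the label of every sample, the final centres, and a boolean mask of the samples treated as isolated. Because the isolated samples are excluded from the mean updates and only receive a label at the end, a distant outlier cannot drag a centre towards itself, which is the property the abstract attributes to adk-means.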
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.