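An Adaptive K-means Initialization Method Based on Minimum Variance
Cite this article: Xiao Yang, Li Ping, Wang Peng, Qiu Ningjia. An Adaptive K-means Initialization Method Based on Minimum Variance [J]. Journal of Changchun University of Science and Technology (Natural Science Edition), 2015, 0(5). DOI: 10.3969/j.issn.1672-9870.2015.05.032
Authors: Xiao Yang, Li Ping, Wang Peng, Qiu Ningjia
Affiliation: School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022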
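Abstract: The K-means algorithm is sensitive to its initial cluster centers, and the clustering result fluctuates as those centers change. To address this, an adaptive K-means initialization method based on minimum variance is proposed. It places the initial cluster centers in K different dense regions of the sample space so that the clustering result converges to the global optimum. First, using the spatial distribution of the samples, the sample variance is computed to obtain a measure of sample compactness, and candidate initial cluster centers that satisfy the selection condition are chosen on that basis. Then the candidates are processed and K initial cluster centers are selected from them. Experiments show that the algorithm achieves high clustering performance, is robust to noise and isolated points, and is suitable for clustering large-scale data sets.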
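The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the compactness score (mean squared distance to the `m` nearest neighbours), the exclusion radius (mean pairwise distance), and the names `min_variance_init`, `dist`, `m`, and `radius` are all hypothetical and should not be read as the authors' actual procedure.

```python
import math


def dist(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def min_variance_init(points, k, m=5):
    """Pick up to k initial centers from dense regions (illustrative only)."""
    n = len(points)
    d = [[dist(p, q) for q in points] for p in points]

    # Compactness score (assumed): mean squared distance to the m nearest
    # neighbours. A small value means the sample sits in a dense region.
    def score(i):
        nearest = sorted(d[i][j] for j in range(n) if j != i)[:m]
        return sum(x * x for x in nearest) / len(nearest)

    scores = [score(i) for i in range(n)]

    # Exclusion radius (assumed): mean distance over all pairs of samples.
    radius = sum(map(sum, d)) / (n * (n - 1))

    # Stage 1: rank candidates from most to least compact.
    order = sorted(range(n), key=scores.__getitem__)

    # Stage 2: keep a candidate only if it is far from every chosen center,
    # so that the centers land in different dense regions.
    chosen = []
    for i in order:
        if len(chosen) == k:
            break
        if all(d[i][c] > radius for c in chosen):
            chosen.append(i)
    return [points[i] for i in chosen]


# Three tight, well-separated groups of samples built deterministically.
blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
offsets = [(-0.4, 0.0), (0.4, 0.0), (0.0, -0.4), (0.0, 0.4),
           (0.0, 0.0), (0.2, 0.2), (-0.2, -0.2)]
data = [(cx + dx, cy + dy) for cx, cy in blobs for dx, dy in offsets]
centers = min_variance_init(data, k=3)
```

On this toy data the sketch returns one center inside each group. It may return fewer than `k` centers when the data has fewer well-separated dense regions than requested, and the full pairwise distance table costs quadratic time and memory, so a version meant for large data sets would need a cheaper neighbourhood search.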

Keywords: clustering; K-means; variance; initial cluster centers

An Adaptive K-means Initialization Method Based on Minimum Deviation
Abstract: The K-means algorithm is sensitive to its initial cluster centers, and the clustering results fluctuate with different initial centers. To solve this problem, this paper proposes an adaptive K-means initialization method based on minimum variance. The initial cluster centers are distributed across K different dense regions of the samples so that the clustering result converges to the global optimum. First, from the spatial distribution of the samples, the sample variance is calculated to obtain the compactness of the samples, and qualified candidate initial cluster centers are selected based on that compactness. Then the candidate initial cluster centers are processed and K initial cluster centers are selected from them. Experiments show that the algorithm has high clustering performance and good robustness to noise and isolated points, and that it is suitable for clustering large-scale data sets.
Keywords: clustering; K-means; variance; initial cluster centers
This article is indexed in CNKI, Wanfang Data, and other databases.