
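An efficient algorithm for determining the optimal number of clusters for K-means
Cite this article: WANG Yong, TANG Jing, RAO Qinfei, YUAN Chaoyan. An efficient algorithm for determining the optimal number of clusters for K-means[J]. Journal of Computer Applications, 2014, 34(5): 1331-1335. DOI: 10.11772/j.issn.1001-9081.2014.05.1331
Authors: WANG Yong, TANG Jing, RAO Qinfei, YUAN Chaoyan
Affiliation: College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054
Funding: project funded by the Chongqing Municipal Education Commission; Graduate Innovation Fund of Chongqing University of Technology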
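Abstract: The K-means clustering algorithm usually cannot have its number of clusters fixed in advance, and a manually chosen initial number of clusters tends to make the clustering results unstable. To address this problem, a new, efficient algorithm for determining the optimal number of clusters for K-means is proposed. The algorithm stratifies the sample data to obtain an upper bound on the search range for the number of clusters, and designs a cluster validity index that evaluates the degree of within-cluster and between-cluster similarity after clustering, so that the optimal number of clusters can be found within that search range. Simulation results show that the algorithm obtains the optimal number of clusters quickly and efficiently and clusters the datasets well.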

Keywords: K-means clustering; data stratification; cluster validity index; degree of similarity; optimal number of clusters
Received: 2013-11-25
Revised: 2013-12-25

High efficient K-means algorithm for determining optimal number of clusters
WANG Yong TANG Jing RAO Qinfei YUAN Chaoyan. High efficient K-means algorithm for determining optimal number of clusters[J]. Journal of Computer Applications, 2014, 34(5): 1331-1335. DOI: 10.11772/j.issn.1001-9081.2014.05.1331
Authors:WANG Yong TANG Jing RAO Qinfei YUAN Chaoyan
Affiliation:College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
Abstract:The number of clusters for K-means usually cannot be set in advance, and a manually chosen initial number of clusters tends to make the clustering results unstable. To address this, an efficient algorithm for determining the optimal number of clusters for K-means is presented. The algorithm stratifies the sample data to obtain an upper bound on the search range for the number of clusters, and designs a new cluster validity index that evaluates the degree of similarity within and between clusters after clustering. The optimal number of clusters is then selected within that search range. Simulation results show that the algorithm obtains the optimal number of clusters quickly and accurately, and that it clusters the datasets well.
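Keywords:K-means clustering; data stratification; cluster validity index; degree of similarity; optimal number of clusters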
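
The abstract describes a two-part search: bound the candidate cluster numbers from above, then score each candidate with a validity index and keep the best one. The abstract does not give the stratification procedure or the index formula, so the sketch below only illustrates the general shape of such a search and is not the authors' method. Everything specific in it is an assumption: the upper bound is the common rule of thumb of the square root of the sample count (standing in for the stratification-based bound), the index is a simple ratio of mean within-cluster distance to minimum between-centroid distance (standing in for the proposed index, lower is better), and the helper names `farthest_point_init`, `kmeans`, `validity` and `best_k` are hypothetical.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding: start from the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's K-means on tuples of floats; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def validity(points, centroids, labels):
    """Stand-in index (NOT the paper's): mean distance of points to their
    centroid divided by the smallest centroid-to-centroid distance."""
    within = sum(
        math.dist(p, centroids[lab]) for p, lab in zip(points, labels)
    ) / len(points)
    between = min(
        math.dist(a, b)
        for i, a in enumerate(centroids)
        for b in centroids[i + 1:]
    )
    return math.inf if between == 0 else within / between


def best_k(points, k_max=None):
    """Score every k in [2, k_max] and return (best k, all scores).
    k_max defaults to sqrt(n), an assumed stand-in for the paper's bound."""
    if k_max is None:
        k_max = max(2, int(math.sqrt(len(points))))
    scores = {}
    for k in range(2, k_max + 1):
        centroids, labels = kmeans(points, k)
        scores[k] = validity(points, centroids, labels)
    return min(scores, key=scores.get), scores
```

Calling `best_k(points)` on a list of coordinate tuples returns the selected cluster number together with the score for every candidate, which makes it easy to inspect how sharply the chosen value stands out. A tighter upper bound, which is what the paper obtains through data stratification, shrinks the loop in `best_k` and is where the efficiency gain would come from.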
This article is indexed in CNKI and other databases.