
Improved k-means algorithm based on extension distance (基于可拓距的改进k-means聚类算法)
Cite this article: ZHAO Yanwei, ZHU Fen, GUI Fangzhi, REN Shedong, XIE Zhiwei, XU Chen. Improved k-means algorithm based on extension distance[J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2020, 15(2): 344-351.
Authors: ZHAO Yanwei (赵燕伟)  ZHU Fen (朱芬)  GUI Fangzhi (桂方志)  REN Shedong (任设东)  XIE Zhiwei (谢智伟)  XU Chen (徐晨)
Affiliation: 1. Key Lab of Special Purpose Equipment and Advanced Manufacturing Technology, Ministry of Education & Zhejiang Province, Zhejiang University of Technology, Hangzhou 310014, China; 2. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310014, China
Abstract: Existing methods for optimizing initial cluster centers can place the first initial center in a sparse region near the data boundary, which leads to unbalanced clustering results. To address this, an improved k-means algorithm is proposed that selects the initial cluster centers using the extension distance. First, the classical distances between samples are mapped onto an extension interval, and the extension left-side and right-side distances are computed with the extension side-distance method. Then the concept of average extension side distance is introduced: the average extension left-side distance and the average extension right-side distance serve as quantitative indicators of sample density and of alienation from the cluster centers, respectively. On this basis, a selection criterion for the initial cluster centers is given. Compared with the traditional k-means algorithm, the improved algorithm selects more evenly distributed initial cluster centers and clusters better, with higher clustering accuracy and better balance, particularly on high-dimensional data.

Keywords: extension distance  k-means clustering algorithm  scaling factor  initial cluster center  density  alienation
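
The abstract names two indicators, sample density and alienation from the cluster centers, but does not give the extension side-distance formulas. The sketch below is therefore not the authors' method: it is a plain density-plus-farthest-point heuristic that only illustrates the selection idea. The function name `select_initial_centers`, the density proxy (mean Euclidean distance to all other samples) and the alienation proxy (distance to the nearest already-chosen center) are all assumptions made for illustration.

```python
import numpy as np

def select_initial_centers(X, k):
    """Pick k initial centers from the rows of X (hypothetical helper).

    Stand-in heuristic, NOT the paper's extension-distance criterion:
    the first center is the sample in the densest region, and each later
    center is a sample that is far from the chosen centers yet still dense.
    Assumes 2 <= k <= len(X) and that the samples are not all identical.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Density proxy (assumption): mean distance to the other samples;
    # a smaller value means the sample sits in a denser region.
    mean_dist = dist.sum(axis=1) / (n - 1)
    chosen = [int(np.argmin(mean_dist))]
    while len(chosen) < k:
        # Alienation proxy (assumption): distance to the nearest chosen center.
        remoteness = dist[:, chosen].min(axis=1)
        # Favour samples far from the chosen centers and in dense regions.
        score = remoteness / mean_dist
        score[chosen] = -np.inf  # never pick the same sample twice
        chosen.append(int(np.argmax(score)))
    return X[chosen]

# Two well-separated groups of four points each.
points = [[0, 0], [0, 1], [1, 0], [1, 1],
          [10, 10], [10, 11], [11, 10], [11, 11]]
centers = select_initial_centers(points, k=2)
```

Because the first center is taken from a dense region rather than at random, it cannot land on an isolated boundary sample, which is the failure mode the abstract describes. The returned centers could then be passed to any standard k-means implementation as its starting points.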