Improved k-means clustering algorithm based on extension distance (基于可拓距的改进k-means聚类算法)

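Cite this article:	ZHAO Yanwei, ZHU Fen, GUI Fangzhi, REN Shedong, XIE Zhiwei, XU Chen. Improved k-means clustering algorithm based on extension distance [J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2020, 15(2): 344-351.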

Authors:	ZHAO Yanwei, ZHU Fen, GUI Fangzhi, REN Shedong, XIE Zhiwei, XU Chen

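Affiliations:	1. Key Laboratory of Special Purpose Equipment and Advanced Manufacturing Technology, Ministry of Education & Zhejiang Province, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China; 2. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310014, Zhejiang, China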

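Abstract:	When existing clustering algorithms optimize the initial cluster centers, the first initial cluster center tends to fall in a non-dense region at the data boundary, which leads to unbalanced clustering results. To address this, an improved k-means algorithm that selects the initial cluster centers based on extension distance is proposed. The classical distances of the samples are mapped onto an extension interval, and the extension left-side distance and extension right-side distance are obtained with the extension side distance calculation method. The concept of average extension side distance is introduced, and the average extension left-side distance and the average extension right-side distance serve as quantitative indicators of sample density and cluster center remoteness, respectively. On this basis, a selection criterion for the initial cluster centers is given. Comparison with the traditional k-means clustering algorithm shows that the initial cluster centers selected by the improved algorithm are more evenly distributed and the clustering results are better; in particular, the improved algorithm achieves higher clustering accuracy and better balance on high-dimensional data.
Keywords:	extension distance; k-means clustering algorithm; scaling factor; initial cluster center; intensity (density); alienation (remoteness)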
Improved k-means algorithm based on extension distance

ZHAO Yanwei, ZHU Fen, GUI Fangzhi, REN Shedong, XIE Zhiwei, XU Chen. Improved k-means algorithm based on extension distance [J]. CAAI Transactions on Intelligent Systems, 2020, 15(2): 344-351.

Authors:	ZHAO Yanwei, ZHU Fen, GUI Fangzhi, REN Shedong, XIE Zhiwei, XU Chen

Affiliation:	1. Key Lab of Special Purpose Equipment and Advanced Manufacturing Technology, Ministry of Education & Zhejiang Province, Zhejiang University of Technology, Hangzhou 310014, China;2. College of Computer Science and Technology, Zhejiang University of T

Abstract:	An improved k-means algorithm that optimizes the initial cluster centers based on extension distance is proposed. It addresses problems that cause imbalanced clustering, such as poor-quality initial cluster center selection and the first initial cluster center easily falling into a non-dense area at the data boundary. First, the classical distance of the samples is mapped onto an extension interval, and the extension left-side and right-side distances are obtained using the extension side distance calculation method. Then, the concept of average extension side distance is introduced, and the average extension left-side and right-side distances are taken as the quantitative indicators of sample density and cluster center remoteness, respectively. On this basis, the selection criteria for the initial cluster centers are given. Finally, comparison with the traditional k-means algorithm shows that the improved k-means algorithm selects more evenly distributed initial cluster centers and obtains higher clustering accuracy and better balance, particularly in high-dimensional data clustering.
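
The abstract does not give formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' method. `extension_distance` uses the standard extenics definition of the distance from a point to an interval, ρ(x, ⟨a, b⟩) = |x − (a + b)/2| − (b − a)/2. The paper's left-side and right-side variants, its mapping of sample distances onto the extension interval, and its scaling factor are not reproduced. `pick_initial_centers` is a hypothetical stand-in that scores candidates by neighbour count (density) times Euclidean distance to the nearest chosen center (remoteness); the function names and the `radius` parameter are assumptions made for this example.

```python
import math


def extension_distance(x, a, b):
    """Extension distance from point x to the interval <a, b>, with a < b.

    Negative inside the interval, zero at an endpoint, positive outside.
    """
    if a >= b:
        raise ValueError("interval must satisfy a < b")
    return abs(x - (a + b) / 2) - (b - a) / 2


def pick_initial_centers(points, k, radius):
    """Hypothetical density-times-remoteness seeding (NOT the paper's criterion).

    The first center is the point with the most neighbours within `radius`,
    so it does not start in a sparse boundary region. Each later center
    maximises density * distance to the nearest center chosen so far.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Density: number of points (including itself) within `radius`.
    density = [
        sum(1 for q in points if math.dist(p, q) <= radius) for p in points
    ]

    chosen = [max(range(len(points)), key=lambda i: density[i])]
    while len(chosen) < k:
        candidates = [i for i in range(len(points)) if i not in chosen]
        # Remoteness: distance to the nearest already-chosen center.
        best = max(
            candidates,
            key=lambda i: density[i]
            * min(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(best)
    return [points[i] for i in chosen]
```

For example, `extension_distance(5, 2, 8)` returns `-3.0` because the point is the midpoint of the interval, while `extension_distance(10, 2, 8)` returns `2.0` because the point lies two units beyond the right endpoint. The seeding heuristic can still prefer a very distant outlier when its distance outweighs its low density, which is one reason the sketch should not be read as the published selection rule.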

Keywords:	extension distance; k-means clustering algorithm; scaling factor; initial cluster center; intensity; alienation
