
Improved Fuzzy C-means Clustering Algorithm Based on Density-Sensitive Distance
Citation: WANG Zhihe, WANG Shuyan, DU Hui. Improved Fuzzy C-means Clustering Algorithm Based on Density-Sensitive Distance[J]. Computer Engineering, 2021, 47(5): 88-96,103. DOI: 10.19678/j.issn.1000-3428.0057901
Authors: WANG Zhihe, WANG Shuyan, DU Hui
Affiliation: School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, China
Received: 2020-03-30
Revised: 2020-04-30

Abstract: The Fuzzy C-means (FCM) clustering algorithm cannot identify non-convex data, and its similarity measure, based on Euclidean distance, considers only the local consistency of data points while ignoring their global consistency. To address this, the paper proposes an improved FCM algorithm that builds the similarity matrix from a density-sensitive distance measure. The Affinity Propagation (AP) algorithm supplies a coarse number of clusters, which serves as the upper limit of the search range for the optimal cluster number. This targets two weaknesses of classical FCM: the number of clusters must be set manually in advance, and randomly selected initial cluster centers make the clustering results unstable. On this basis, the max-min distance algorithm is improved to select representative sample points as the initial cluster centers, and the silhouette coefficient is used to determine the optimal cluster number automatically. Experimental results on UCI and artificial data sets show that, compared with the classical FCM, K-means, and CFSFDP algorithms, the proposed algorithm can identify complex non-convex data and converges faster while maintaining clustering performance and stability.

Keywords: Fuzzy C-means (FCM) clustering algorithm; density-sensitive distance; Affinity Propagation (AP); initial cluster center; silhouette coefficient
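
The abstract names a density-sensitive distance and a max-min distance step but does not give their formulas. As a rough illustration only, the sketch below assumes one definition of density-sensitive distance that is common in the clustering literature: each edge gets the density-adjusted length rho**d(x, y) - 1 (with rho > 1), and the distance between two points is the shortest path under those lengths. It then applies the plain (unimproved) max-min seeding rule to the resulting matrix. The function names, the parameter `rho`, and the sample points are hypothetical, and the paper's improved max-min variant may differ.

```python
import math

def density_adjusted_length(a, b, rho=2.0):
    # Assumed edge length rho**d(a, b) - 1: it grows faster than the
    # Euclidean distance d, so one long jump costs more than many short hops.
    return rho ** math.dist(a, b) - 1.0

def density_sensitive_distances(points, rho=2.0):
    # All-pairs shortest paths (Floyd-Warshall) over the adjusted edge lengths.
    n = len(points)
    dist = [[density_adjusted_length(points[i], points[j], rho)
             for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist

def max_min_seeds(dist, n_seeds, first=0):
    # Plain max-min seeding: repeatedly add the point whose distance to its
    # nearest already-chosen seed is largest.
    seeds = [first]
    while len(seeds) < n_seeds:
        candidates = [i for i in range(len(dist)) if i not in seeds]
        seeds.append(max(candidates,
                         key=lambda i: min(dist[i][s] for s in seeds)))
    return seeds

# Hypothetical 1-D data: a dense chain at 0, 1, 2 and an isolated point at 10.
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
D = density_sensitive_distances(points)
print(D[0][2])              # 2.0: two unit hops (1 + 1) beat the direct jump (3)
print(max_min_seeds(D, 2))  # [0, 3]: the isolated point becomes the second seed
```

Under this assumed definition, points connected through a dense region end up close to each other even when their straight-line distance is large, which is the property that lets a distance-based method follow non-convex cluster shapes.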
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).