
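A Comparative Density Peak Clustering Algorithm Based on K-Nearest Neighbors
Cite this article: DU Pei, CHENG Xiaorong. A Comparative Density Peak Clustering Algorithm Based on K-Nearest Neighbors[J]. Computer Engineering and Applications, 2019, 55(10): 161-168.
Authors: DU Pei  CHENG Xiaorong
Affiliation: School of Control and Computer Engineering, North China Electric Power University, Baoding, Hebei 071000, China
Funding: Fundamental Research Funds for the Central Universities
Abstract: The clustering quality of the Fast Search and Discovery Density Peak Clustering Algorithm (CFSFDP) depends heavily on a subjectively chosen truncation distance dc, and the best value of dc is hard to determine. In addition, on data sets with complex distributions and large density variations, the decision graph generated by the algorithm does not separate cluster centers from non-center points clearly enough, which makes the cluster centers hard to select. To address these problems, the algorithm is optimized and a Comparative Density Peak Clustering algorithm Based on K-Nearest Neighbors (CDPC-KNN) is proposed. The algorithm uses the K-nearest-neighbor concept to redefine how the truncation distance and the local density are measured, so that the truncation distance is generated adaptively for any data set and the computed local density better matches the true distribution of the data. A distance comparison quantity is also introduced into the decision graph in place of the original distance parameter, making the cluster centers more prominent on the decision graph. Experiments show that CDPC-KNN clusters better overall than CFSFDP and DBSCAN, and a separation experiment shows that the new algorithm effectively improves the discrimination between cluster centers and non-center points.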

Keywords: clustering algorithm  density peaks  K-nearest neighbors  decision graph  cluster centers

Comparative Density Peaks Clustering Based on K-Nearest Neighbors
DU Pei,CHENG Xiaorong.Comparative Density Peaks Clustering Based on K-Nearest Neighbors[J].Computer Engineering and Applications,2019,55(10):161-168.
Authors:DU Pei  CHENG Xiaorong
Affiliation:School of Control and Computer Engineering, North China Electric Power University, Baoding, Hebei 071000, China
Abstract: The clustering quality of the Fast Search and Discovery Density Peak Clustering Algorithm (CFSFDP) depends heavily on a subjectively chosen truncation distance dc, and the optimal value of dc is not easy to determine. Moreover, on data sets with complex structure and large variations in density, the decision graph generated by CFSFDP does not separate cluster centers from non-center points clearly enough, which makes the cluster centers hard to select. To address these problems, the algorithm is optimized and a Comparative Density Peak Clustering algorithm based on K-Nearest Neighbors (CDPC-KNN) is proposed. The algorithm uses the K-nearest-neighbor concept to redefine how the truncation distance and the local density are measured. It generates the truncation distance adaptively for arbitrary data sets and makes the computed local density more consistent with the real distribution of the data. In addition, a distance comparison quantity replaces the original distance parameter, so that the cluster centers are more prominent on the decision graph. Experimental results show that CDPC-KNN clusters better overall than CFSFDP and DBSCAN, and a separation experiment shows that the new algorithm effectively improves the discrimination between cluster centers and non-center points.
Keywords:clustering algorithm  density peaks  K-nearest neighbors  decision graph  cluster centers  
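
The abstract does not give the formulas for CDPC-KNN, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, purely for illustration, that the truncation distance is the mean distance to each point's k-th nearest neighbor and that the local density is a Gaussian kernel summed over the k nearest neighbors. The second decision-graph quantity shown here is the distance to the nearest denser point, as in the original density peak method; the paper's distance comparison quantity is not reproduced because the abstract does not define it. All function names are hypothetical.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def knn_cutoff_and_density(dist, k):
    """Illustrative KNN-based cutoff distance and local density.

    Both definitions below are assumptions made for this sketch; they are
    not taken from the paper.
    """
    # Distances from each point to its k nearest neighbours.
    # After sorting, index 0 is the point's zero distance to itself.
    knn = [sorted(row)[1:k + 1] for row in dist]
    # Assumption: cutoff = mean distance to the k-th nearest neighbour.
    dc = sum(row[-1] for row in knn) / len(knn)
    # Assumption: Gaussian kernel summed over the k nearest neighbours only.
    rho = [sum(math.exp(-(d / dc) ** 2) for d in row) for row in knn]
    return dc, rho


def delta_to_denser(dist, rho):
    """Distance from each point to the nearest point of higher density."""
    # Visit points from densest to sparsest; ties keep index order.
    order = sorted(range(len(rho)), key=lambda i: -rho[i])
    delta = [0.0] * len(rho)
    # By convention the densest point gets its largest distance.
    delta[order[0]] = max(dist[order[0]])
    for rank in range(1, len(order)):
        i = order[rank]
        delta[i] = min(dist[i][j] for j in order[:rank])
    return delta
```

Plotting `rho` against `delta` gives a decision graph in the original sense: points where both values are large are candidate cluster centers. Because `dc` is derived from the data, only `k` has to be chosen by hand in this sketch.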
This article is indexed in VIP, Wanfang Data, and other databases.