
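Density peak clustering algorithm based on grid neighbor optimization
Cite this article: Liu Ji, Yang Jinrui. Density peak clustering algorithm based on grid neighbor optimization[J]. Application Research of Computers, 2024, 41(4): 1058-1063.
Authors: Liu Ji (刘继), Yang Jinrui (杨金瑞)
Affiliations: 1. School of Statistics and Data Science, Xinjiang University of Finance and Economics; 2. Research Center for Xinjiang Social and Economic Statistics and Big Data Application, Xinjiang University of Finance and Economics
Funding: National natural and social science fund project (72164034)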
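Abstract: Density peak clustering (DPC) combines the local density and the relative distance of data sample points and can cluster datasets of arbitrary shape. However, the DPC algorithm suffers from a subjectively chosen truncation (cutoff) distance, a simplistic assignment strategy, and high time complexity. To address these problems, this paper proposes a density peak clustering algorithm based on grid neighbor optimization (the KG-DPC algorithm). First, the data space is partitioned into a grid, which reduces the number of distance computations between sample points. When computing local density, the algorithm considers not only the density of a grid cell itself but also the densities of its k nearest neighboring cells, which reduces the influence of the subjectively chosen truncation distance on the clustering result and improves clustering accuracy. A grid density threshold is also set to keep the clustering results stable. Experimental results show that KG-DPC substantially improves clustering accuracy over the DBSCAN, DPC, and SDPC algorithms, and that it reduces average clustering time by 38%, 44%, and 44% relative to the DPC, SNN-DPC, and DPC-NN algorithms, respectively. While maintaining basic clustering accuracy, the KG-DPC algorithm has a certain advantage in clustering efficiency.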

Keywords: density peak clustering; density threshold; grid; nearest neighbor optimization
Received: 2023/8/1
Revised: 2024/3/15

Density peak clustering algorithm based on grid neighbor optimization
Affiliation: Xinjiang University of Finance & Economics
Abstract: Density peak clustering (DPC) combines the local density and the relative distance of data sample points and can cluster datasets of arbitrary shape. However, the DPC algorithm suffers from a subjectively chosen truncation (cutoff) distance, a simplistic allocation strategy, and high time complexity. This article proposes a density peak clustering algorithm based on grid nearest neighbor optimization (the KG-DPC algorithm). First, it grids the data space to reduce the number of distance computations between sample data points. When calculating local density, it considers not only the density of a grid cell itself but also the densities of the k nearest neighboring cells, which reduces the impact of the subjectively chosen truncation distance on the clustering results and improves clustering accuracy. It also sets a grid density threshold to ensure the stability of the clustering results. The experimental results show that the KG-DPC algorithm significantly improves clustering accuracy compared with the DBSCAN, DPC, and SDPC algorithms, and that it reduces average clustering time by 38%, 44%, and 44% compared with the DPC, SNN-DPC, and DPC-NN algorithms, respectively. While maintaining basic clustering accuracy, the KG-DPC algorithm has a certain advantage in clustering efficiency.
Keywords:density peak clustering  density threshold  grid  nearest neighbor optimization
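
The abstract describes the pipeline only in outline: grid the data space, compute a per-cell density that also draws on the k nearest cells, apply a density threshold, then run density-peak clustering. The Python sketch below is one possible reading of that outline and is not the authors' KG-DPC implementation. The function name `grid_dpc_sketch`, its parameters, the density formula (own count plus the mean count of the k nearest kept cells), and the center-selection rule (largest density times relative distance) are all assumptions made for illustration, since the abstract gives no formulas.

```python
import numpy as np


def grid_dpc_sketch(points, cells_per_dim=10, k=4, density_threshold=1, n_clusters=2):
    """Illustrative grid-based density-peak clustering (hypothetical, not the paper's KG-DPC)."""
    points = np.asarray(points, dtype=float)
    lo, hi = points.min(axis=0), points.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero on flat axes

    # 1. Grid the data space: map every point to an integer cell index.
    idx = np.floor((points - lo) / span * cells_per_dim).astype(int)
    idx = np.clip(idx, 0, cells_per_dim - 1)
    cells, inverse, counts = np.unique(
        idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.ravel()  # the inverse shape differs across NumPy versions

    # 2. Density threshold: cells with too few points are treated as noise.
    keep = counts >= density_threshold
    centers = cells[keep].astype(float)
    own = counts[keep].astype(float)
    m = len(centers)

    # 3. Local density (assumed form): own count plus mean count of the k nearest cells.
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    kk = min(k, m - 1)
    if kk > 0:
        nn = np.argsort(dist, axis=1)[:, 1:kk + 1]  # column 0 is the cell itself
        rho = own + own[nn].mean(axis=1)
    else:
        rho = own.copy()

    # 4. Relative distance: distance to the nearest cell of higher density.
    #    The densest cell keeps delta = inf, so it is always chosen as a center.
    order = np.argsort(-rho, kind="stable")
    delta = np.full(m, np.inf)
    parent = np.full(m, -1)
    for rank in range(1, m):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i], parent[i] = dist[i, j], j

    # 5. Choose centers by largest rho * delta, then pass labels down the density order.
    gamma = rho * delta
    peaks = np.argsort(-gamma, kind="stable")[:n_clusters]
    cell_label = np.full(m, -1)
    cell_label[peaks] = np.arange(len(peaks))
    for i in order:
        if cell_label[i] == -1 and parent[i] != -1:
            cell_label[i] = cell_label[parent[i]]

    # 6. Points inherit the label of their cell; points in dropped cells get -1.
    full = np.full(len(cells), -1)
    full[keep] = cell_label
    return full[inverse]
```

Calling `grid_dpc_sketch(X, cells_per_dim=10, k=4, density_threshold=2, n_clusters=2)` on an `(n, d)` array `X` returns one integer label per point, with `-1` marking points that fall in cells below the threshold. Pairwise distances are computed between occupied cells rather than between points, which is where a grid-based variant would be expected to save time when there are far fewer cells than points.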