
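Improved density peak clustering algorithm based on weighted K-nearest neighbor
Cite this article: Yang Zhen, Wang Hongjun. Improved density peak clustering algorithm based on weighted K-nearest neighbor[J]. Application Research of Computers, 2020, 37(3): 667-671.
Authors: Yang Zhen, Wang Hongjun
Affiliations: National University of Defense Technology, Hefei 230037; National University of Defense Technology, Hefei 230037
Funding: Supported by the National Natural Science Foundation of China
Abstract (translated from the Chinese): The density peak clustering algorithm is a novel density-based clustering algorithm. However, the original algorithm considers only the global structure of the data and performs poorly on unevenly distributed datasets. It also selects cluster centers solely from the distribution of points on the decision graph, so it lacks a reliable selection criterion. To address these problems, an improved density peak clustering algorithm based on weighted K-nearest neighbors is proposed. It brings the idea of the nearest-neighbor algorithm into density peak clustering, redefines and computes the local density of each data point, and identifies the critical point for cluster centers from the trend of the weight slope. Experiments on artificial datasets and real UCI datasets compare the improved algorithm with the original density peak clustering, K-means, and DBSCAN algorithms. They show that the improved algorithm clusters datasets of uneven density effectively, can discover clusters of arbitrary shape, and generally scores higher than the other three algorithms on three clustering performance metrics.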

Keywords (translated from the Chinese): data mining; weighted K-nearest neighbor; density peaks; clustering
Received: 2018/8/31
Revised: 2018/10/25

Improved density peak clustering algorithm based on weighted K-nearest neighbor
Yang Zhen and Wang Hongjun.Improved density peak clustering algorithm based on weighted K-nearest neighbor[J].Application Research of Computers,2020,37(3):667-671.
Authors:Yang Zhen and Wang Hongjun
Affiliation:National University of Defense Technology,
Abstract: The density peak clustering algorithm is a recent density-based clustering algorithm that requires only one input parameter and no repeated iteration. However, the original algorithm considers only the global structure of the data, so it performs poorly on datasets with uneven distribution. Moreover, it selects cluster centers solely from the distribution of points on the decision graph, which is not a reliable criterion. To address these problems, this paper proposes an improved density peak clustering algorithm based on weighted K-nearest neighbors. It introduces the idea of the nearest-neighbor algorithm into density peak clustering, redefines and computes the local density of each data point, and determines the critical point for cluster centers from the trend of the weight slope. The improved algorithm is compared with the original density peak clustering algorithm, K-means, and DBSCAN in experiments on artificial datasets and real UCI datasets. The results show that the improved algorithm can handle datasets of uneven density and find clusters of arbitrary shape. On three clustering performance metrics, it generally scores higher than the other three algorithms.
Keywords:data mining  weighted K-nearest neighbor  density peaks  clustering
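
The abstract does not give the paper's formulas, so the following Python sketch only illustrates the general shape of the approach it describes: a local density computed from each point's K nearest neighbors, the standard density peak quantities (the distance `delta` to the nearest denser point and the product `gamma = rho * delta`), and a cut in the sorted `gamma` values to separate cluster centers from ordinary points. The exponential neighbor weighting in `knn_density` and the "largest drop" rule in `pick_centers` are assumptions made for illustration, not the authors' definitions, and all function names are hypothetical.

```python
import math


def euclid(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_density(dist, k):
    """ASSUMED local density: sum of exp(-d) over each point's k nearest
    neighbours. The paper's actual weighting is not given in the abstract."""
    rho = []
    for i, row in enumerate(dist):
        nearest = sorted(d for j, d in enumerate(row) if j != i)[:k]
        rho.append(sum(math.exp(-d) for d in nearest))
    return rho


def delta_and_parent(dist, rho):
    """Standard density peak step: for each point, the distance to the nearest
    point ranked denser (ties are broken by position in the sorted order)."""
    n = len(rho)
    order = sorted(range(n), key=lambda i: -rho[i])
    delta, parent = [0.0] * n, [-1] * n
    # The densest point has no denser neighbour; use its largest distance.
    delta[order[0]] = max(dist[order[0]])
    for rank in range(1, n):
        i = order[rank]
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i], parent[i] = dist[i][j], j
    return delta, parent, order


def pick_centers(rho, delta):
    """ASSUMED stand-in for the paper's slope-trend criterion: sort
    gamma = rho * delta in descending order and cut at the largest drop
    between consecutive values. Needs at least two points."""
    gamma = [r * d for r, d in zip(rho, delta)]
    ranked = sorted(range(len(gamma)), key=lambda i: -gamma[i])
    drops = [gamma[ranked[t]] - gamma[ranked[t + 1]]
             for t in range(len(ranked) - 1)]
    cut = max(range(len(drops)), key=lambda t: drops[t])
    return ranked[: cut + 1]


def cluster(points, k):
    """Return (labels, centers): a cluster id per point and the center indices."""
    dist = [[euclid(p, q) for q in points] for p in points]
    rho = knn_density(dist, k)
    delta, parent, order = delta_and_parent(dist, rho)
    # The densest point has no parent, so it is always treated as a center.
    centers = sorted(set(pick_centers(rho, delta)) | {order[0]})
    labels = [-1] * len(points)
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Visit points from densest to sparsest so each parent is labelled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

A call such as `cluster(points, k=3)` on a list of coordinate tuples returns one cluster id per point. Because the density depends only on each point's own neighborhood rather than on a global cutoff distance, a sparse cluster can still produce a clear peak, which is the motivation the abstract gives for the K-nearest-neighbor density.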
This article is indexed in Wanfang Data and other databases.