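Varied density clustering algorithm based on border point detection
Cite this article: 陈延伟 (Yanwei CHEN), 赵兴旺 (Xingwang ZHAO). Varied density clustering algorithm based on border point detection[J]. 计算机应用 (Journal of Computer Applications), 2022, 42(8): 2450-2460.
Authors: 陈延伟 (Yanwei CHEN)  赵兴旺 (Xingwang ZHAO)
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006
Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan 030006
Funding: National Natural Science Foundation of China (62072293)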
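Abstract: Density-based clustering algorithms are widely used because they are robust to noise and can discover clusters of arbitrary shape. In practice, however, they cluster poorly when the clusters in a dataset differ in density and the borders between clusters are hard to distinguish. To address these problems, a Varied Density Clustering algorithm based on Border point Detection (VDCBD) is proposed. First, the border points between clusters of varied density are identified with a proposed relative density measure, which makes adjacent clusters easier to separate. Second, the points in non-border regions are clustered to find the core cluster structures of the dataset. Third, the detected border points are assigned to the corresponding core cluster structures according to a high-density neighbor allocation principle. Finally, the noise points in the dataset are identified from the cluster structure information. The algorithm was compared with K-means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), the Density Peaks Clustering Algorithm (DPCA), CLUstering based on Backbone (CLUB), and Border Peeling clustering (BP) on synthetic datasets and UCI datasets. The experimental results show that the proposed algorithm effectively handles uneven cluster densities and indistinct borders, and that it outperforms the existing algorithms on the Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), F-Measure (FM), and Accuracy (ACC); in the running-efficiency analysis, when the data size is large, VDCBD runs more efficiently than DPCA, CLUB, and BP.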

Keywords: density clustering; relative density; varied density; border point detection; noise recognition
Received: 2021-06-24
Revised: 2021-12-07
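
The four steps named in the abstract (border point detection, clustering of non-border points, border point assignment, noise identification) can be illustrated with a short Python sketch. The abstract does not define the relative density measure, the clustering rule for non-border points, or the noise criterion, so everything concrete below is an assumption made for illustration and is not the authors' VDCBD: the function name `vdcbd_sketch`, the parameters `k`, `border_ratio`, and `noise_factor`, the use of k-nearest-neighbor density divided by the neighbors' mean density as "relative density", and the median-based noise rule.

```python
import math
from statistics import median

def vdcbd_sketch(points, k=5, border_ratio=0.8, noise_factor=3.0):
    """Illustrative four-step pipeline; returns one label per point (-1 = noise)."""
    n = len(points)
    # k nearest neighbours of every point as (distance, index) pairs (brute force).
    nbrs = [sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
            for i, p in enumerate(points)]
    mean_d = [sum(d for d, _ in nb) / k for nb in nbrs]
    dens = [1.0 / (m + 1e-12) for m in mean_d]

    # Step 1 (ASSUMED measure): density relative to the neighbours' mean density.
    rel = [dens[i] / (sum(dens[j] for _, j in nbrs[i]) / k) for i in range(n)]
    border = [r < border_ratio for r in rel]

    # Step 2 (ASSUMED rule): core structures are the connected components of
    # the undirected k-NN graph restricted to non-border points.
    adj = [set() for _ in range(n)]
    for i in range(n):
        for _, j in nbrs[i]:
            if not border[i] and not border[j]:
                adj[i].add(j)
                adj[j].add(i)
    labels = [-1] * n
    n_clusters = 0
    for s in range(n):
        if border[s] or labels[s] != -1:
            continue
        labels[s] = n_clusters
        stack = [s]
        while stack:
            i = stack.pop()
            for j in adj[i]:
                if labels[j] == -1:
                    labels[j] = n_clusters
                    stack.append(j)
        n_clusters += 1

    # Step 3: visit border points from dense to sparse; each joins the cluster
    # of its nearest already-labelled neighbour of higher density.
    for i in sorted((i for i in range(n) if border[i]), key=lambda i: -dens[i]):
        for _, j in nbrs[i]:  # neighbours are already sorted by distance
            if labels[j] != -1 and dens[j] > dens[i]:
                labels[i] = labels[j]
                break

    # Step 4 (ASSUMED rule): a point is noise if it is far sparser than the
    # typical (median) member of the cluster it was assigned to.
    typical = {c: median(mean_d[i] for i in range(n) if labels[i] == c)
               for c in range(n_clusters)}
    return [-1 if c != -1 and mean_d[i] > noise_factor * typical[c] else c
            for i, c in enumerate(labels)]

# Toy data: a dense 5x5 grid, a 10x sparser 5x5 grid, and one isolated point.
dense = [(x, y) for x in range(5) for y in range(5)]
sparse = [(100 + 10 * x, 10 * y) for x in range(5) for y in range(5)]
labels = vdcbd_sketch(dense + sparse + [(50, 50)])
print(labels)
```

Because the assumed relative density is a ratio of local densities, the dense and the sparse grid are treated alike: in each grid only the four corner points fall below `border_ratio`, and they rejoin their own grid in step 3. The isolated point is first attached to the sparse grid and then flagged as noise in step 4. The brute-force neighbor search is quadratic in the number of points and is only meant for small examples.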

Varied density clustering algorithm based on border point detection
Yanwei CHEN, Xingwang ZHAO. Varied density clustering algorithm based on border point detection[J]. Journal of Computer Applications, 2022, 42(8): 2450-2460.
Authors:Yanwei CHEN  Xingwang ZHAO
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan Shanxi 030006, China
Abstract: Density clustering algorithms have been widely used because of their robustness to noise and their ability to find clusters of arbitrary shape. However, in practical applications, this type of algorithm clusters poorly when the densities of different clusters in the dataset are unevenly distributed and the borders between clusters are difficult to distinguish. To solve these problems, a Varied Density Clustering algorithm based on Border point Detection (VDCBD) was proposed. Firstly, the border points between varied density clusters were recognized based on the given relative density measurement method to enhance the separability of adjacent clusters. Secondly, the points in the non-border area were clustered to find the core class structures of the dataset. Thirdly, the detected border points were allocated to the corresponding core class structures according to the principle of high-density neighbor allocation. Finally, the noise points in the dataset were recognized based on the class structure information. The proposed algorithm was compared with K-means, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, the Density Peaks Clustering Algorithm (DPCA), the CLUstering based on Backbone (CLUB) algorithm, and the Border Peeling clustering (BP) algorithm on artificial datasets and UCI datasets. Experimental results show that the proposed algorithm can effectively solve the problems of uneven distribution of density and indistinguishable borders, and is superior to the existing algorithms on the evaluation indicators of Adjusted Rand Index (ARI), Normalized Mutual Information (NMI), F-Measure (FM), and Accuracy (ACC); in the analysis of operating efficiency, when the data size is relatively large, the operating efficiency of VDCBD is higher than those of the DPCA, CLUB, and BP algorithms.
Keywords: density clustering; relative density; varied density; border point detection; noise recognition
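
Of the evaluation indicators listed in the abstract, the Adjusted Rand Index (ARI) is easy to compute by hand from the contingency counts of two labelings. The sketch below uses the standard pair-counting formula; the abstract does not say which implementation was used for the reported results, so the function name `adjusted_rand_index` and the handling of degenerate inputs are assumptions.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(truth, pred):
    """Pair-counting ARI between two labelings of the same points."""
    n = len(truth)
    if n < 2:
        return 1.0  # assumed convention: no pairs to disagree on
    # Pairs of points placed together in both labelings.
    together = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    # Pairs of points placed together in each labeling separately.
    in_truth = sum(comb(c, 2) for c in Counter(truth).values())
    in_pred = sum(comb(c, 2) for c in Counter(pred).values())
    expected = in_truth * in_pred / comb(n, 2)  # chance-level agreement
    maximum = (in_truth + in_pred) / 2
    if maximum == expected:
        return 1.0  # assumed convention for degenerate (trivial) partitions
    return (together - expected) / (maximum - expected)

# ARI ignores label names: swapping 0 and 1 still counts as a perfect match.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

An ARI of 1 means the two partitions are identical up to renaming of the labels, values near 0 mean agreement no better than chance, and negative values mean agreement worse than chance.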