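自然邻居密度极值聚类算法 (Natural Neighbor Density Extremum Clustering Algorithm)
Cite this article:张忠林,赵昱,闫光辉. 自然邻居密度极值聚类算法[J]. 计算机工程与应用, 2021, 57(23): 200-210. DOI: 10.3778/j.issn.1002-8331.2107-0529
Authors:张忠林  赵昱  闫光辉
Affiliation:School of Electronics and Information Engineering, 兰州交通大学 (Lanzhou Jiaotong University), Lanzhou 730000
Abstract:The density peak clustering algorithm has difficulty detecting cluster centers in low-density regions when density varies widely across a data set, and it is sensitive to its parameters. To address these problems, a new density extremum algorithm is proposed. The concept of natural neighbors is introduced to find the natural neighbors of each data object, and an ellipse model is defined to compute the local density of the data in the natural stable state. The cosine similarity of the data objects is computed and used to update each object's connected value, and the connected value is used to separate high-density regions, low-density regions, and outliers. A density extremum function is constructed to find the cluster centers in the high-density and low-density regions. Non-center points in each region are then merged into the cluster of their nearest cluster center. Experiments on synthetic data sets and UCI public data sets show that the algorithm achieves better results than the compared algorithms on data sets with large differences in density distribution.

Keywords:clustering  natural neighbor  density adaptive distance  anchor points  connected value  density extreme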
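
The abstract names natural neighbors as the first step but does not spell out the search procedure. The sketch below follows the commonly described natural neighbor search, in which the neighborhood size r grows until every point has at least one reverse neighbor or the number of points without one stops changing. The function name `natural_neighbor_search`, the Euclidean distance, and this exact stopping rule are assumptions made for illustration, not details taken from the paper.

```python
import math


def natural_neighbor_search(points):
    """Return (r, natural), where natural[i] holds the mutual r-nearest neighbors of i.

    Assumed formulation: grow r one step at a time and stop once no point is
    left without a reverse neighbor, or once that count stops changing.
    """
    n = len(points)
    # For every point, the indices of all other points sorted by Euclidean distance.
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j, i=i: math.dist(points[i], points[j]))
        for i in range(n)
    ]
    knn = [set() for _ in range(n)]  # nearest neighbors collected so far
    rnn = [set() for _ in range(n)]  # reverse neighbors: points that list this one
    prev_orphans = None
    r = 0
    while r < n - 1:
        for i in range(n):
            j = order[i][r]          # the next-nearest neighbor of i
            knn[i].add(j)
            rnn[j].add(i)
        r += 1
        orphans = sum(1 for s in rnn if not s)
        if orphans == 0 or orphans == prev_orphans:
            break                    # assumed "natural stable state"
        prev_orphans = orphans
    # i and j are natural neighbors when each is among the other's r nearest.
    natural = [knn[i] & rnn[i] for i in range(n)]
    return r, natural


# Two well-separated groups of three points each (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
r, natural = natural_neighbor_search(points)
print(r, natural[0], natural[3])
```

On this toy input the search stops without any user-supplied neighborhood size, and each point's natural neighbors are the other two points of its own group. That is the sense in which natural neighbors reduce parameter sensitivity.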

Natural Neighbor Density Extremum Clustering Algorithm
ZHANG Zhonglin,ZHAO Yu,YAN Guanghui. Natural Neighbor Density Extremum Clustering Algorithm[J]. Computer Engineering and Applications, 2021, 57(23): 200-210. DOI: 10.3778/j.issn.1002-8331.2107-0529
Authors:ZHANG Zhonglin  ZHAO Yu  YAN Guanghui
Affiliation:School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730000, China
Abstract:The density peak clustering algorithm can find non-spherical clusters of arbitrary shape, but it is sensitive to its parameters and has difficulty detecting cluster centers in low-density regions when density varies widely across the data set. To address these problems, a new density extremum algorithm is proposed. First, the concept of natural neighbors is introduced to find the natural neighbors of each data object, and an ellipse model is defined to calculate the local density of the data in the natural stable state. Second, the cosine similarity of the data objects is calculated and used to update each object's connected value, and the connected value is used to separate high-density regions, low-density regions, and outliers. Then, a density extremum function is constructed to find the cluster centers in the high-density and low-density regions. Finally, the non-center points of each region are merged into the cluster of their nearest cluster center. Experiments on synthetic data sets and UCI public data sets show that the algorithm achieves better results than the compared algorithms on data sets with large differences in density distribution.
Keywords:clustering  natural neighbor  density adaptive distance  anchor points  connected value  density extreme  
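
The final step in the abstract, merging each non-center point into the cluster of its nearest cluster center, can be sketched as follows. This is a minimal illustration that assumes the cluster centers have already been found and that "nearest" means plain Euclidean distance. The keywords mention a density adaptive distance, which the abstract does not define, so it is not reproduced here. The function name `assign_to_nearest_center` and the sample data are hypothetical.

```python
import math


def assign_to_nearest_center(points, center_indices):
    """Label every point with the position (in center_indices) of its nearest center.

    Assumption: Euclidean distance stands in for whatever distance the paper uses.
    """
    labels = []
    for p in points:
        nearest = min(
            range(len(center_indices)),
            key=lambda c: math.dist(p, points[center_indices[c]]),
        )
        labels.append(nearest)
    return labels


# Made-up data: two groups, with points 0 and 3 taken as the cluster centers.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = assign_to_nearest_center(points, center_indices=[0, 3])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

A center is always assigned to its own cluster, because its distance to itself is zero.
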
This article is indexed in 万方数据 (Wanfang Data) and other databases.