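Density peak clustering algorithm based on non-parametric kernel density estimation
Cite this article: Xie Guowei, Qian Xuezhong, Zhou Shibing. Density peak clustering algorithm based on non-parametric kernel density estimation [J]. Application Research of Computers (计算机应用研究), 2018, 35(10).
Authors: Xie Guowei (谢国伟), Qian Xuezhong (钱雪忠), Zhou Shibing (周世兵)
Affiliation: Institute of Intelligent Systems and Network Computing, School of Internet of Things Engineering, Jiangnan University (all three authors)
Funding: National Natural Science Foundation of China (61673193); Fundamental Research Funds for the Central Universities (JUSRP11235, JUSRP51635B)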
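Abstract: The density peak clustering algorithm CFSFDP (Clustering by fast search and find of density peaks) has two shortcomings: the cutoff distance used to compute density must be judged manually, and the cluster centers must be picked by hand. To address them, this paper proposes a density peak clustering algorithm based on non-parametric kernel density estimation. First, non-parametric kernel density estimation is used to compute the local density of each data point. Second, an automatic center selection strategy based on the ranking graph determines the potential cluster centers, and the remaining data points are assigned to the corresponding centers. Finally, adjacent similar sub-clusters are merged according to an inter-cluster merging criterion, and noise points are identified by boundary density, yielding the clustering result. Experiments on synthetic test data sets and real UCI data sets show that, compared with the original CFSFDP algorithm, the new algorithm not only avoids the subjectivity of manually judging the cutoff distance and picking cluster centers but also achieves higher accuracy.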
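
The first step replaces the cutoff-distance density of CFSFDP with a kernel density estimate evaluated at each data point. The abstract does not say which kernel or bandwidth rule the authors use, so the sketch below assumes a Gaussian kernel with Silverman's rule-of-thumb bandwidth. The function name `kde_local_density` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def kde_local_density(X, bandwidth=None):
    """Gaussian kernel density estimate evaluated at every sample point.

    Assumption: Gaussian kernel and a single scalar bandwidth; the paper's
    actual kernel and bandwidth choice are not given in the abstract.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if bandwidth is None:
        # Silverman's rule of thumb, using the mean per-feature standard deviation.
        sigma = X.std(axis=0, ddof=1).mean()
        bandwidth = sigma * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    # Sum of Gaussian kernels centered on every sample (including the point itself).
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return kernel.sum(axis=1) / norm
```

Because the bandwidth is derived from the data, no cutoff distance has to be supplied by the user. Points inside a dense region receive a high value and isolated points a low one.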

Keywords: clustering; density peaks; non-parametric kernel density estimation; cutoff distance
Received: 2017/5/18
Revised: 2018/8/30

Density peak clustering algorithm based on non-parametric kernel density estimation
XIE Guo-wei, QIAN Xue-zhong, ZHOU Shi-bing. Density peak clustering algorithm based on non-parametric kernel density estimation [J]. Application Research of Computers, 2018, 35(10).
Authors: XIE Guo-wei, QIAN Xue-zhong, ZHOU Shi-bing
Affiliation: Institute of Intelligent Systems and Network Computing, School of Internet of Things Engineering, Jiangnan University
Abstract: The density peak clustering algorithm CFSFDP (Clustering by fast search and find of density peaks) requires a manually chosen cutoff distance to compute density and manual selection of cluster centers. To address these shortcomings, this paper proposes an improved algorithm based on non-parametric kernel density estimation. First, the local density of each data point is computed by non-parametric kernel density estimation. Second, an automatic cluster-center selection strategy applied to the ranking graph determines the candidate cluster centers, and the remaining data points are assigned to the corresponding centers. Finally, neighboring similar sub-clusters are merged according to an inter-cluster merging criterion, and noise points are identified by boundary density. Experiments on synthetic test data sets and real UCI data sets show that the new algorithm not only determines cluster centers automatically but also achieves higher accuracy than the original CFSFDP algorithm.
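
The center selection and assignment step can be illustrated with the standard density-peak quantities from CFSFDP: for each point, the distance δ to its nearest denser neighbor, and the ranking score γ = ρ·δ. This is a minimal sketch, not the authors' method. The abstract does not specify the automatic selection rule, so the number of centers is passed in as the hypothetical parameter `n_centers`, and the merging and noise-identification steps are omitted. The function name `density_peak_labels` is also an assumption.

```python
import numpy as np

def density_peak_labels(X, rho, n_centers):
    """Pick the n_centers points with the largest gamma = rho * delta and
    assign every other point to the cluster of its nearest denser neighbor.

    Assumption: n_centers is supplied by the caller; the paper selects the
    centers automatically by a rule not described in the abstract.
    """
    X = np.asarray(X, dtype=float)
    rho = np.asarray(rho, dtype=float)
    n = len(X)
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))

    order = np.argsort(-rho, kind="stable")  # indices from densest to sparsest
    delta = np.zeros(n)
    parent = np.full(n, -1)                  # nearest denser neighbor of each point
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        parent[i] = j
    # By convention the global density peak gets its largest distance to any point.
    delta[order[0]] = dist[order[0]].max()

    gamma = rho * delta
    # Rank by gamma; ties are broken in favor of denser points, so the
    # global density peak is always among the selected centers.
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_centers]]

    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    for i in order:                          # parents are labeled before children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers, gamma
```

Any local density estimate can be passed as `rho`. Sorting `gamma` in descending order gives the kind of ranking plot on which a selection rule could operate: cluster centers stand out because they combine high density with a large distance to any denser point.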
Keywords: clustering; density peaks; non-parametric kernel density estimation; cutoff distance