
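Spectral Clustering Algorithm Based on Message Passing (基于消息传递的谱聚类算法)
Cite this article: Wang Lijuan, Ding Shifei, Jia Hongjie. Spectral clustering algorithm based on message passing[J]. Journal of Data Acquisition & Processing (数据采集与处理), 2019, 34(3): 548-557.
Authors: Wang Lijuan (王丽娟), Ding Shifei (丁世飞), Jia Hongjie (贾洪杰)
Affiliations: 1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116; 2. School of Information and Electrical Engineering, Xuzhou College of Industrial Technology, Xuzhou, 221400; 3. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, 212013
Funding: National Natural Science Foundation of China (61676522, 61379101); Xuzhou Science and Technology Development Fund (KC17132).
Abstract: Spectral clustering recasts data clustering as a graph partitioning problem and clusters the data points by finding optimal subgraphs. The key to spectral clustering is constructing a suitable similarity matrix that faithfully describes the intrinsic structure of the dataset. Traditional spectral clustering algorithms that build the similarity matrix with a Gaussian kernel function are sensitive to the choice of the scale parameter, and their clustering stage requires randomly chosen initial cluster centers, which makes clustering performance unstable. To address these problems, this paper proposes a spectral clustering algorithm based on message passing. The algorithm adopts a density-adaptive similarity measure, which better describes the relationships between data points, and then uses the "message passing" mechanism of affinity propagation (AP) clustering to obtain high-quality cluster centers, improving the performance of spectral clustering. Experiments show that the new algorithm handles the clustering of multi-scale datasets effectively, that its clustering performance is very stable, and that its clustering quality is better than that of the traditional spectral clustering algorithm and the k-means algorithm.

Keywords: spectral clustering; similarity matrix; message passing; clustering stability
Received: 2018-01-08
Revised: 2019-04-09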
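
As background for the two ingredients the abstract names, the block below writes out the Gaussian kernel similarity and the affinity propagation message updates in their standard textbook forms. These are general definitions, not formulas quoted from the paper, and the symbols (x_i for data points, sigma for the scale parameter, s, r, a for similarity, responsibility and availability) are notation assumed here for illustration.

```latex
% Gaussian kernel similarity with one global scale parameter \sigma
W_{ij} = \exp\!\left( -\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}} \right)

% Affinity propagation: responsibility sent from point i to candidate exemplar k
r(i,k) \leftarrow s(i,k) - \max_{k' \neq k} \bigl\{ a(i,k') + s(i,k') \bigr\}

% Affinity propagation: availability sent from candidate exemplar k to point i
a(i,k) \leftarrow \min\Bigl\{ 0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\{0,\, r(i',k)\} \Bigr\}, \qquad i \neq k

% Self-availability of candidate exemplar k
a(k,k) \leftarrow \sum_{i' \neq k} \max\{0,\, r(i',k)\}
```

Because a single sigma sets the neighbourhood width everywhere, a value that suits a dense cluster is too narrow for a sparse one, which is the scale sensitivity the abstract refers to. After the messages settle, point k is treated as a cluster center when r(k,k) + a(k,k) > 0, so no random initialization is involved.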

Spectral Clustering Algorithm Based on Message Passing
Cite this article: Wang Lijuan, Ding Shifei, Jia Hongjie. Spectral Clustering Algorithm Based on Message Passing[J]. Journal of Data Acquisition & Processing, 2019, 34(3): 548-557.
Authors: Wang Lijuan, Ding Shifei, Jia Hongjie
Affiliation: 1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China; 2. School of Information and Electrical Engineering, Xuzhou College of Industrial Technology, Xuzhou, 221400, China; 3. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, 212013, China
Abstract: Spectral clustering transforms the data clustering problem into a graph partitioning problem and clusters data points by finding the optimal subgraphs. The key to spectral clustering is constructing a suitable similarity matrix that faithfully describes the intrinsic structure of the dataset. However, traditional spectral clustering algorithms construct the similarity matrix with a Gaussian kernel function, which makes them sensitive to the choice of the scale parameter. In addition, the initial cluster centers must be determined randomly at the clustering stage, so the clustering performance is not stable. This paper presents a spectral clustering algorithm based on message passing. The algorithm uses a density-adaptive similarity measure, which better describes the relations between data points, and obtains high-quality cluster centers through the message passing mechanism of affinity propagation (AP) clustering, which improves the performance of spectral clustering. Experiments show that the proposed algorithm can effectively deal with the clustering of multi-scale datasets. Its clustering performance is very stable, and its clustering quality is better than that of the traditional spectral clustering algorithm and the k-means algorithm.
Keywords: spectral clustering; similarity matrix; message passing; clustering stability
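
The pipeline the abstract outlines (density-adaptive similarity, spectral embedding, exemplar selection by message passing) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not define its similarity measure, so a local-scaling kernel (each point's kernel width is its distance to its k-th nearest neighbour) is swapped in as a stand-in; running affinity propagation on the rows of the normalized spectral embedding, and using the median similarity as the preference, are also assumptions. All function names and the defaults `k=7`, `damping=0.9` are hypothetical.

```python
import numpy as np


def local_scaling_affinity(X, k=7):
    """W_ij = exp(-d_ij^2 / (sigma_i * sigma_j)), sigma_i = distance to the k-th neighbour.

    Assumed stand-in for the paper's density-adaptive measure (not specified in the abstract).
    """
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    k = min(k, len(X) - 1)
    sigma = np.sqrt(np.sort(D2, axis=1)[:, k])  # column 0 is the point itself
    W = np.exp(-D2 / (np.outer(sigma, sigma) + 1e-12))
    np.fill_diagonal(W, 0.0)
    return W


def spectral_embedding(W, n_components):
    """Top eigenvectors of D^{-1/2} W D^{-1/2}, with each row scaled to unit length."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    L = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    Y = vecs[:, -n_components:]
    return Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)


def responsibilities(S, A):
    """r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]."""
    idx = np.arange(S.shape[0])
    AS = A + S
    best = AS.argmax(axis=1)
    first = AS[idx, best]
    AS[idx, best] = -np.inf          # mask the row maximum to find the runner-up
    second = AS.max(axis=1)
    R = S - first[:, None]
    R[idx, best] = S[idx, best] - second
    return R


def availabilities(R):
    """a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k}); a(k,k) is uncapped."""
    idx = np.arange(R.shape[0])
    Rp = np.maximum(R, 0.0)
    Rp[idx, idx] = R[idx, idx]       # keep r(k,k) itself, even when negative
    A = Rp.sum(axis=0)[None, :] - Rp
    diag = A[idx, idx].copy()
    A = np.minimum(A, 0.0)
    A[idx, idx] = diag
    return A


def affinity_propagation(S, damping=0.9, max_iter=200):
    """Damped message passing on S (S[k, k] is the preference). Returns (exemplars, labels)."""
    R = np.zeros_like(S)
    A = np.zeros_like(S)
    for _ in range(max_iter):
        R = damping * R + (1.0 - damping) * responsibilities(S, A)
        A = damping * A + (1.0 - damping) * availabilities(R)
    evidence = np.diag(R + A)
    exemplars = np.flatnonzero(evidence > 0)
    if exemplars.size == 0:          # nothing elected itself: fall back to the best candidate
        exemplars = np.array([int(np.argmax(evidence))])
    labels = S[:, exemplars].argmax(axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels


def message_passing_spectral_clustering(X, n_components=2, k=7):
    """Affinity -> spectral embedding -> exemplars chosen by message passing."""
    Y = spectral_embedding(local_scaling_affinity(X, k), n_components)
    S = -np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(S, np.median(S[~np.eye(len(X), dtype=bool)]))  # assumed preference
    return affinity_propagation(S)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two blobs with very different spreads, i.e. a small multi-scale dataset.
    X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(10.0, 1.0, (20, 2))])
    exemplars, labels = message_passing_spectral_clustering(X)
    print("exemplar indices:", exemplars)
```

In this sketch the number of clusters is not passed in; it follows from the preference on the diagonal of `S`, so a lower preference generally yields fewer exemplars. Affinity propagation can also oscillate when many similarities are tied, which is why the updates are damped.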