Belief Density Peak Clustering Algorithm for Uncertain Data
Cite this article:WANG Kang, MA Zongfang, TIAN Hongpeng, SONG Lin. Belief Density Peak Clustering Algorithm for Uncertain Data[J]. Information and Control, 2022, 51(3): 349-360.
Authors:WANG Kang  MA Zongfang  TIAN Hongpeng  SONG Lin
Affiliation:College of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, Shaanxi, China
Funding:National Key Research and Development Program of China (2019YFC1907105); Key Research and Development Program of Shaanxi Province (2020GY-186, 2020SF-367)
Abstract:The density peak clustering algorithm is simple and efficient, requires no iterative computation, and does not need the number of clusters to be set in advance. However, it is prone to a "domino" effect when assigning non-center samples, and it cannot accurately partition noise or samples in overlapping regions. To solve these problems, a belief density peak clustering algorithm for uncertain data is proposed. First, starting from the cluster centers obtained by the density peak clustering algorithm, the algorithm uses the K-nearest neighbors of each non-center sample to compute the sample's degree of belief in each cluster, and assigns the sample to the cluster with the largest degree of belief. This yields a preliminary clustering result based on K-nearest neighbors. Then, the upper quantile of the density is calculated to obtain a density threshold, and a credal partition is carried out under the framework of evidential reasoning: isolated samples whose density is below the threshold are assigned to the noise cluster; samples in overlapping regions are assigned to a composite cluster (meta-cluster) composed of the related single clusters; and samples whose degrees of belief strongly support one cluster are assigned to the corresponding single cluster. By introducing composite clusters and a noise cluster, the algorithm represents more accurately the uncertainty of samples under the available attribute information. Experimental results show that the algorithm achieves better clustering performance than the compared algorithms on both artificial and UCI datasets.

Keywords:clustering  density peak  K-nearest neighbors (KNN)  evidential reasoning  credal partition
Received:2021-05-24
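
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and the abstract does not give the formulas, so several choices here are assumptions made only for illustration: the Gaussian-kernel density with cutoff `dc`, the inverse-distance weighting of neighbor votes, the quantile level `q` used for the density threshold, the `margin` rule that decides when two beliefs are close enough to form a composite cluster, and every function and parameter name.

```python
import numpy as np

def belief_dpc_sketch(X, n_centers, k=4, dc=1.0, q=0.05, margin=0.2):
    """Return one tuple per sample: () = noise, (c,) = single cluster,
    (c1, c2) = composite cluster. All names and defaults are hypothetical."""
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

    # Density-peak quantities: Gaussian-kernel density rho (self excluded)
    # and delta, the distance to the nearest sample of higher density.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    delta = np.empty(n)
    for i in range(n):
        higher = np.flatnonzero(rho > rho[i])
        delta[i] = d[i, higher].min() if higher.size else d[i].max()

    # Cluster centers: the n_centers samples with the largest rho * delta.
    centers = np.argsort(rho * delta)[-n_centers:]
    label = np.full(n, -1)
    label[centers] = np.arange(n_centers)
    belief = np.zeros((n, n_centers))
    belief[centers, label[centers]] = 1.0

    # Visit non-center samples from dense to sparse; each one collects
    # inverse-distance votes from its already-labeled K nearest neighbors
    # (an assumed weighting, not taken from the paper).
    for i in np.argsort(-rho):
        if label[i] >= 0:
            continue
        nn = np.argsort(d[i])
        nn = nn[nn != i][:k]
        nn = nn[label[nn] >= 0]
        if nn.size == 0:  # no labeled neighbor yet: fall back to nearest center
            nn = centers[[np.argmin(d[i, centers])]]
        np.add.at(belief[i], label[nn], 1.0 / (1.0 + d[i, nn]))
        belief[i] /= belief[i].sum()
        label[i] = int(np.argmax(belief[i]))

    # Credal partition: low-density samples become noise; samples whose two
    # strongest beliefs differ by less than `margin` form a composite cluster;
    # all remaining samples keep their single cluster.
    threshold = np.quantile(rho, q)
    result = []
    for i in range(n):
        top = np.argsort(belief[i])[::-1][:2]
        if rho[i] < threshold:
            result.append(())
        elif len(top) == 2 and belief[i, top[0]] - belief[i, top[1]] < margin:
            result.append(tuple(sorted(int(c) for c in top)))
        else:
            result.append((int(top[0]),))
    return result
```

Calling `belief_dpc_sketch(X, n_centers=2)` on an `(n, 2)` array returns a list of tuples, so a composite assignment such as `(0, 1)` stays distinguishable from a single-cluster assignment such as `(0,)` and from noise `()`. Note that this sketch takes the number of centers as an input for simplicity, whereas density peak clustering normally reads the centers off the decision graph.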