
Rough K-means incremental clustering algorithm considering neighborhood belonging information of boundary samples
Cite as: MA Fu-min, SUN Jing-yong, ZHANG Teng-fei. Rough K-means incremental clustering algorithm considering neighborhood belonging information of boundary samples[J]. Control and Decision (控制与决策), 2022, 37(11): 2968-2976.
Authors: MA Fu-min (马福民), SUN Jing-yong (孙静勇), ZHANG Teng-fei (张腾飞)
Affiliation: College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China; College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Funding: National Natural Science Foundation of China (61973151, 62073173); Natural Science Foundation of Jiangsu Province (BK20191406, BK20191376).
Abstract: The key to improving the quality of incremental clustering is how to measure the membership of newly added data on the basis of the existing clustering results. Existing incremental clustering algorithms mostly consider the location distribution of a new data point and ignore the cluster membership of its neighboring points. Building on rough K-means clustering, and targeting the uncertainty of new data points that fall into the boundary regions of existing clusters, a rough K-means incremental clustering algorithm based on neighborhood belonging information is proposed. For a new sample in a boundary region, the algorithm considers both its location distribution and the cluster membership of its neighboring points, so that the measure of its belonging to each cluster is more reasonable. In addition, during incremental clustering, clusters are merged or split according to the changes in cluster structure caused by the new data, so that the partition adjusts adaptively. Comparative experiments on artificial data sets and UCI standard data sets verify the effectiveness of the algorithm.

Keywords: rough K-means clustering; incremental clustering; neighborhood belonging information; cluster structure
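
The abstract does not give the paper's actual formulas, so the following is only a minimal illustrative sketch of the general idea: a new point that falls in a boundary region is scored by combining its distance to each candidate cluster center with the cluster labels of its nearest existing neighbors. The function name `assign_new_point`, the ratio threshold `ratio`, the neighbor count `k`, and the weight `alpha` are all assumptions made for this example, not the authors' definitions.

```python
import numpy as np

def assign_new_point(x, centers, data, labels, ratio=1.3, k=3, alpha=0.5):
    """Assign a new point x to a cluster (illustrative, not the paper's method).

    Returns (cluster_index, in_boundary).
    """
    eps = 1e-12  # guards against division by zero
    d = np.linalg.norm(centers - x, axis=1)
    nearest = int(np.argmin(d))

    # Ratio-threshold rule common in rough K-means variants: every cluster
    # whose distance is within `ratio` times the smallest one is a candidate.
    candidates = np.flatnonzero(d <= ratio * d[nearest])
    if len(candidates) == 1:
        # Only one candidate: x lies in that cluster's lower approximation.
        return nearest, False

    # Boundary case: look at the labels of the k nearest existing points.
    nn = np.argsort(np.linalg.norm(data - x, axis=1))[:k]
    votes = np.bincount(labels[nn], minlength=len(centers)) / len(nn)

    # Location term: normalized inverse distance to each candidate center.
    closeness = 1.0 / (d[candidates] + eps)
    closeness /= closeness.sum()

    # Hybrid score (assumed form): weighted sum of location and neighbor votes.
    score = alpha * closeness + (1 - alpha) * votes[candidates]
    return int(candidates[np.argmax(score)]), True

# Hypothetical toy data: two clusters, the second one stretching toward the first.
centers = np.array([[0.0, 0.0], [4.0, 0.0]])
data = np.array([[0.0, 0.0], [0.5, 0.0], [-0.5, 0.0],
                 [4.0, 0.0], [3.5, 0.0], [2.5, 0.1], [2.4, -0.1], [2.6, 0.0]])
labels = np.array([0, 0, 0, 1, 1, 1, 1, 1])

boundary_result = assign_new_point(np.array([1.9, 0.0]), centers, data, labels)
core_result = assign_new_point(np.array([0.1, 0.0]), centers, data, labels)
```

In this toy setup the point at (1.9, 0) is slightly closer to the first center, yet its three nearest neighbors all belong to the second cluster, so the hybrid score assigns it there; a purely distance-based rule would have chosen the first cluster. The merge and split steps mentioned in the abstract are not covered by this sketch.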