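Data Stream Clustering Algorithm Based on Density and Affinity Propagation
Cite this article:ZHANG Jian-Peng, CHEN Fu-Cai, LI Shao-Mei, LIU Li-Xiong. Data stream clustering algorithm based on density and affinity propagation [J]. Acta Automatica Sinica (自动化学报), 2014, 40(2): 277-288.
Authors:ZHANG Jian-Peng  CHEN Fu-Cai  LI Shao-Mei  LIU Li-Xiong
Affiliation:1. National Digital Switching System Engineering and Technological Research and Development Center, Zhengzhou 450002
Funding:Supported by the National High Technology Research and Development Program of China (863 Program) (2011AA010603, 2011AA010605)
Abstract:Existing algorithms have low clustering accuracy, handle outliers poorly, and cannot detect changes in a data stream in real time. To address these shortcomings, a data stream clustering algorithm that fuses density with affinity propagation is proposed. The algorithm adopts an online/offline two-stage processing framework. It introduces a micro-cluster decay density to reflect the evolution of the data stream accurately, and it maintains and prunes micro-clusters dynamically online so that the model better matches the intrinsic characteristics of the original data stream. When a new class pattern is detected in the model, an improved weighted affinity propagation clustering algorithm (Weighted and hierarchical affinity propagation, WAP) is used to rebuild the model, so the algorithm can detect changes in the data stream in real time and give clustering results at any time. Experiments on real and synthetic data sets show that the algorithm has good applicability, effectiveness, and scalability, and achieves good clustering results.

Keywords:data stream mining    affinity propagation    density-based clustering    change detection
Received:2013-01-16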

Data Stream Clustering Algorithm Based on Density and Affinity Propagation Techniques
ZHANG Jian-Peng, CHEN Fu-Cai, LI Shao-Mei, LIU Li-Xiong. Data Stream Clustering Algorithm Based on Density and Affinity Propagation Techniques[J]. Acta Automatica Sinica, 2014, 40(2): 277-288.
Authors:ZHANG Jian-Peng  CHEN Fu-Cai  LI Shao-Mei  LIU Li-Xiong
Affiliation:1.National Digital Switching System Engineering and Technological Research and Development Center, Zhengzhou 450002
Abstract:Existing data stream clustering algorithms suffer from low accuracy, handle outliers poorly, and cannot detect changes in a data stream in real time. To address these shortcomings, a data stream clustering algorithm based on density and affinity propagation is proposed. The algorithm adopts an online/offline two-stage processing framework and introduces a micro-cluster decay density to reflect the evolution of the data stream accurately. Micro-clusters are maintained and pruned dynamically online, which keeps the model consistent with the intrinsic characteristics of the original data stream. When a new class pattern is detected, an improved WAP (weighted and hierarchical affinity propagation) algorithm is used to rebuild the model, so the algorithm can detect changes in the data stream in real time and give clustering results at any time. Experiments on real and synthetic data sets show that the algorithm has good applicability, effectiveness, and scalability, and achieves good clustering results.
Keywords:Data stream mining  affinity propagation  density-based clustering  change detection method
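
The online stage described in the abstract (micro-clusters whose density decays over time, maintained and pruned as points arrive) can be illustrated with a minimal Python sketch. The abstract does not state the decay function, the absorption rule, or any thresholds, so everything specific here is an assumption: the exponential decay 2^(-lam * dt) is the form commonly used in density-based stream clustering, and the names `MicroCluster`, `update`, `lam`, `radius`, and `min_weight` are hypothetical. The offline WAP re-clustering step is not shown; the boolean return value only stands in for a "new pattern" signal that might trigger it.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    center: list        # running weighted mean of absorbed points
    weight: float       # density at time `last_update`
    last_update: float  # timestamp of the last absorbed point

    def decayed_weight(self, now, lam):
        # Assumed decay model: the weight halves every 1/lam time units.
        return self.weight * 2.0 ** (-lam * (now - self.last_update))

    def absorb(self, point, now, lam):
        # Fade the old weight first, then add the new point with weight 1.
        w = self.decayed_weight(now, lam)
        new_w = w + 1.0
        self.center = [(w * c + x) / new_w for c, x in zip(self.center, point)]
        self.weight = new_w
        self.last_update = now


def update(clusters, point, now, lam=0.1, radius=1.0, min_weight=0.1):
    """Process one stream point; return True if a new micro-cluster was created."""
    # Prune micro-clusters whose decayed density fell below the threshold.
    clusters[:] = [mc for mc in clusters
                   if mc.decayed_weight(now, lam) >= min_weight]
    # Absorb the point into the nearest micro-cluster if it is close enough.
    nearest = min(clusters, key=lambda mc: math.dist(mc.center, point),
                  default=None)
    if nearest is not None and math.dist(nearest.center, point) <= radius:
        nearest.absorb(point, now, lam)
        return False
    # Otherwise start a new micro-cluster (hypothetical change signal).
    clusters.append(MicroCluster(list(point), 1.0, now))
    return True
```

Calling `update(clusters, point, now)` once per arriving point keeps `clusters` as a compact, time-weighted summary of the stream; an offline clustering step could then be run over the micro-cluster centers, weighted by their decayed densities, whenever a result is requested.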
This article is indexed in CNKI and other databases.