
Evolutionary data stream clustering algorithm based on integration of affinity propagation and density
Citation: XING Changzheng, LIU Jian. Evolutionary data stream clustering algorithm based on integration of affinity propagation and density[J]. Journal of Computer Applications, 2015, 35(7): 1927-1932.
Authors: XING Changzheng, LIU Jian
Affiliation: Graduate School, Liaoning Technical University, Xingcheng, Liaoning 125105, China
Funding: National Natural Science Foundation of China (61402212).
Abstract: To address three problems in data stream clustering, namely poor handling of outliers, low clustering efficiency, and the inability to detect dynamic changes in the stream in real time, an evolutionary data stream clustering algorithm based on the integration of affinity propagation and density (I-APDenStream) is proposed. The algorithm uses the traditional two-stage processing model, consisting of online and offline clustering. It introduces a micro-cluster decay density that reflects the dynamic changes of the data stream, together with a deletion mechanism for maintaining micro-clusters dynamically online. It also introduces an outlier detection and removal mechanism that is applied when the model is rebuilt with extended Weighted Affinity Propagation (WAP) clustering. Experimental results on two types of data sets show that the clustering accuracy of the proposed algorithm generally stays above 95%, and that it performs well in purity comparisons and other related tests. The proposed algorithm can cluster data streams with good timeliness, quality, and efficiency.

Keywords: outlier; data stream clustering; Affinity Propagation (AP); micro-cluster
Received: 2015-01-15
Revised: 2015-03-25
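
The abstract names a micro-cluster decay density and an online deletion mechanism but does not give their formulas. As a rough illustration only, the sketch below assumes the exponential fading function 2^(-lam * dt) used by DenStream-style algorithms. The names `MicroCluster`, `prune`, `lam`, and `min_weight` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MicroCluster:
    """Summary of nearby points: decayed weight and decayed linear sum."""
    linear_sum: List[float]
    weight: float
    last_update: float

    def decay_to(self, now: float, lam: float) -> None:
        # Assumption: fading factor 2^(-lam * dt); the paper may differ.
        factor = 2.0 ** (-lam * (now - self.last_update))
        self.weight *= factor
        self.linear_sum = [s * factor for s in self.linear_sum]
        self.last_update = now

    def insert(self, point: List[float], now: float, lam: float) -> None:
        # Fade the old summary first, then absorb the new point with weight 1.
        self.decay_to(now, lam)
        self.weight += 1.0
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]

    def center(self) -> List[float]:
        return [s / self.weight for s in self.linear_sum]


def prune(clusters: List[MicroCluster], now: float, lam: float,
          min_weight: float) -> List[MicroCluster]:
    """Decay every micro-cluster to `now` and keep those at or above min_weight."""
    kept = []
    for mc in clusters:
        mc.decay_to(now, lam)
        if mc.weight >= min_weight:
            kept.append(mc)
    return kept
```

In a two-stage design of this kind, the surviving micro-clusters (their centers and weights) would be what the offline stage clusters. The weighted affinity propagation step itself is not sketched here.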
This article is indexed in Wanfang Data and other databases.