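A Clustering Algorithm over Uncertain Data Streams (一种不确定数据流聚类算法)
Cite this article: ZHANG Chen, JIN Che-Qing, ZHOU Ao-Ying. A clustering algorithm over uncertain data streams [J]. Journal of Software (软件学报), 2010, 21(9): 2173-2182.
Authors: ZHANG Chen, JIN Che-Qing, ZHOU Ao-Ying
Affiliations: 1. Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433
2. Shanghai Key Laboratory of Trustworthy Computing, Software Engineering Institute, East China Normal University, Shanghai 200062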
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60933001 and 60803020; the National Science Foundation for Distinguished Young Scholars of China under Grant No. 60925008; the Shanghai Leading Academic Disc
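Abstract: This paper proposes the EMicro algorithm to address clustering over uncertain data streams. Whereas most existing techniques consider only the distance between tuples, EMicro takes into account both the distance between tuples and the uncertainty of each tuple itself, and it defines a new criterion to describe the quality of clustering results. An outlier-handling mechanism is also proposed: the system maintains two buffers at the same time, one holding normal micro-clusters and the other holding potential outlier micro-clusters, with the aim of achieving good performance. Experimental results show that, compared with existing work, EMicro is more efficient and produces good clustering results.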

Keywords: uncertain data stream; clustering; outlier
Received: 2008/11/17
Revised: 2009/4/29

Clustering Algorithm over Uncertain Data Streams
ZHANG Chen, JIN Che-Qing, ZHOU Ao-Ying. Clustering Algorithm over Uncertain Data Streams [J]. Journal of Software, 2010, 21(9): 2173-2182.
Authors: ZHANG Chen, JIN Che-Qing, ZHOU Ao-Ying
Abstract: This paper proposes a novel algorithm, named EMicro, for clustering uncertain data streams. Whereas most existing methods rely mainly on a distance metric to describe cluster quality, EMicro considers the distance metric and data uncertainty together when measuring clustering quality. A second contribution is an outlier-processing mechanism: two buffers are maintained, one holding normal micro-clusters and the other holding potential outlier micro-clusters, to obtain good performance. Experimental results show that EMicro outperforms existing methods in both efficiency and effectiveness.
Keywords: uncertain data stream; clustering; outlier
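
The two-buffer idea in the abstract can be illustrated with a minimal Python sketch. It is not the EMicro algorithm itself: the abstract gives no formulas, so everything specific here is an assumption made for illustration. The names `MicroCluster` and `TwoBufferClusterer`, the parameters `radius` and `promote_mass`, the score (Euclidean distance divided by a tuple's existence probability, so less certain tuples must lie closer to be absorbed), and the promotion rule (an outlier micro-cluster becomes normal once its accumulated probability reaches `promote_mass`) are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison, so list.remove() targets this exact object
class MicroCluster:
    """Running summary of absorbed tuples (hypothetical structure)."""
    n: int = 0
    linear_sum: list = field(default_factory=list)
    prob_sum: float = 0.0  # accumulated existence probability

    def add(self, point, prob):
        if self.n == 0:
            self.linear_sum = [0.0] * len(point)
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]
        self.n += 1
        self.prob_sum += prob

    def centroid(self):
        return [s / self.n for s in self.linear_sum]


class TwoBufferClusterer:
    """Keeps normal micro-clusters and potential outlier micro-clusters apart."""

    def __init__(self, radius, promote_mass):
        self.radius = radius              # assumed absorption threshold
        self.promote_mass = promote_mass  # assumed promotion threshold
        self.normal = []
        self.outliers = []

    def _nearest(self, buffer, point, prob):
        # Assumed score: distance scaled up for uncertain tuples (0 < prob <= 1).
        best, best_score = None, self.radius
        for mc in buffer:
            score = math.dist(mc.centroid(), point) / prob
            if score <= best_score:
                best, best_score = mc, score
        return best

    def insert(self, point, prob):
        # Try the normal buffer first.
        mc = self._nearest(self.normal, point, prob)
        if mc is not None:
            mc.add(point, prob)
            return
        # Otherwise absorb into, or create, a potential outlier micro-cluster.
        mc = self._nearest(self.outliers, point, prob)
        if mc is None:
            mc = MicroCluster()
            self.outliers.append(mc)
        mc.add(point, prob)
        # Promote once enough probability mass has accumulated.
        if mc.prob_sum >= self.promote_mass:
            self.outliers.remove(mc)
            self.normal.append(mc)
```

In this sketch a new tuple never creates a normal micro-cluster directly; it starts in the outlier buffer and is promoted only after further nearby tuples arrive, which is one plausible reading of "potential outlier micro-clusters".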
This article is indexed in Wanfang Data (万方数据) and other databases.