
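A Clustering Algorithm for High-Dimensional Turnstile Data Streams
Cite this article:ZHOU Xiao-Yun, ZHANG Jing, SUN Zhi-Hui. A Clustering Algorithm for High-Dimensional Turnstile Data Streams[J]. Computer Science, 2006, 33(11): 14-17
Authors:ZHOU Xiao-Yun  ZHANG Jing  SUN Zhi-Hui
Affiliation:Department of Computer Science and Engineering, Southeast University, Nanjing 210096; Department of Computer Science and Engineering, Southeast University, Nanjing 210096; School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212001
Funding:National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education; Natural Science Foundation of Jiangsu Province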
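Abstract:Existing data stream clustering algorithms can handle only Time Series and Cash Register data streams, and their accuracy is unsatisfactory when they are applied to high-dimensional data streams. This paper proposes HT-Stream, a subspace clustering algorithm for high-dimensional Turnstile data streams. The algorithm partitions the data space into a grid, dynamically maintains grid-cell information online, stores summary statistics in tilted time windows, and outputs clustering results offline for a user-specified time span. Experiments on real and synthetic datasets show that the algorithm is both applicable and effective.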

Keywords:Data stream  Subspace clustering  High dimension  Tilted time windows

An Efficient Clustering Algorithm for High Dimensional Turnstile Data Streams
ZHOU Xiao-Yun,ZHANG Jing,SUN Zhi-Hui. An Efficient Clustering Algorithm for High Dimensional Turnstile Data Streams[J]. Computer Science, 2006, 33(11): 14-17
Authors:ZHOU Xiao-Yun  ZHANG Jing  SUN Zhi-Hui
Affiliation:Department of Computer Science and Engineering, Southeast University, Nanjing 210096; College of Electronic and Information Engineering, Jiangsu University, Zhenjiang 212001
Abstract:Previous methods can handle only Time Series and Cash Register data streams, and their accuracy on high-dimensional data streams is unsatisfactory. This paper presents HT-Stream, a novel algorithm for clustering Turnstile data streams. HT-Stream partitions the data space into grid cells, summarizes statistical information over the data stream in tilted time windows, and finds the clusters offline. It addresses the high-dimensional clustering problem and can discover clusters of arbitrary shape. Experimental results on real and synthetic datasets demonstrate the applicability and effectiveness of the approach.
Keywords:Data stream  Subspace clustering   High dimension   Tilted time windows
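
The abstract's first two ideas, grid partitioning and online maintenance of cell information under Turnstile updates (where points may be deleted as well as inserted), can be illustrated with a minimal sketch. This is not the authors' HT-Stream implementation: the class `GridCounter` and the parameters `cell_width` and `min_count` are hypothetical names chosen for illustration, and the sketch omits subspace search and tilted time windows entirely.

```python
from collections import defaultdict
from math import floor


class GridCounter:
    """Illustrative only: per-cell point counts under Turnstile updates."""

    def __init__(self, cell_width):
        # cell_width is an assumed uniform grid resolution for every dimension
        self.cell_width = cell_width
        self.counts = defaultdict(int)

    def cell_of(self, point):
        # Map a point to the integer coordinates of the grid cell containing it
        return tuple(floor(x / self.cell_width) for x in point)

    def update(self, point, delta):
        # Turnstile model: delta = +1 records an insertion, -1 a deletion
        key = self.cell_of(point)
        self.counts[key] += delta
        if self.counts[key] == 0:
            del self.counts[key]  # drop empty cells to save memory

    def dense_cells(self, min_count):
        # Cells holding at least min_count points (an assumed density threshold)
        return {cell for cell, n in self.counts.items() if n >= min_count}


grid = GridCounter(cell_width=1.0)
for p in [(0.2, 0.3), (0.7, 0.9), (0.5, 0.1), (3.4, 3.8)]:
    grid.update(p, +1)
grid.update((0.7, 0.9), -1)  # a deletion, which Cash Register streams cannot express
dense = grid.dense_cells(min_count=2)
```

Keeping only a count per occupied cell means a deletion is handled by the same constant-time update as an insertion, which is one plausible reason a grid summary suits the Turnstile setting.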
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.