Approach for data streams clustering over dynamic sliding windows


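Cite this article:	ZHANG Zhongping, WANG Hao, XUE Wei, XIA Yan. Approach for data streams clustering over dynamic sliding windows[J]. Computer Engineering and Applications, 2011, 47(7): 135-138.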

Authors:	ZHANG Zhongping, WANG Hao, XUE Wei, XIA Yan

Affiliation:	College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China

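Funding:	National Natural Science Foundation of China; Scientific Research Program of the Hebei Provincial Department of Education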

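Abstract:	Data stream clustering is an important problem in cluster analysis. To address the fact that the arrival rate of a data stream varies over time, a data stream clustering algorithm based on dynamic sliding windows is proposed on top of the two-phase clustering framework. In the online phase, micro-cluster features are introduced to store summary information about the data stream; the stored summaries are used to dynamically adjust the size of the sliding window, and the distance between each data point and the micro-cluster centers is computed in order to maintain the micro-cluster features. In the offline phase, the K-means algorithm is applied to the results of the online phase to perform macro-clustering and produce the final clusters. Experimental results show that the algorithm achieves relatively high clustering quality and good scalability.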
Keywords:	data mining; data streams; clustering; sliding windows
Approach for data streams clustering over dynamic sliding windows

ZHANG Zhongping, WANG Hao, XUE Wei, XIA Yan. Approach for data streams clustering over dynamic sliding windows[J]. Computer Engineering and Applications, 2011, 47(7): 135-138.

Authors:	ZHANG Zhongping, WANG Hao, XUE Wei, XIA Yan

Affiliation:	College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China

Abstract:	Clustering data streams is an important problem in cluster analysis. To handle data streams whose arrival rate varies, an efficient data stream clustering algorithm over dynamic sliding windows is proposed, based on the two-phase framework. In the online component, a novel micro-cluster feature is introduced to store the important summary statistics of the data stream. The stored summaries are used to adjust the size of the sliding window, and the distances from each data point to the micro-cluster centers are computed, so that the corresponding clustering features are maintained dynamically. In the offline component, the k-means algorithm is applied to the mean values of the micro-clusters produced by the online component to generate the final clustering results. Experimental results show that the approach achieves relatively high clustering purity and good scalability.

Keywords:	data mining; data streams; clustering; sliding windows
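
The two-phase scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the micro-cluster feature definition or the window-resizing rule, so everything concrete here is an assumption. In particular, the summary kept per micro-cluster (count, linear sum, last update time), the names `MicroCluster`, `OnlinePhase`, `radius`, and `target_points`, and the rule "size the window so that it spans about `target_points` arrivals at the current rate" are all hypothetical. The offline step is a plain weighted k-means over micro-cluster centers, seeded with the first `k` centers for determinism.

```python
import math


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class MicroCluster:
    """Assumed summary: point count, per-dimension linear sum, last update time."""

    def __init__(self, point, t):
        self.n = 1
        self.ls = list(point)
        self.last_t = t

    def center(self):
        return [s / self.n for s in self.ls]

    def absorb(self, point, t):
        self.n += 1
        self.ls = [s + x for s, x in zip(self.ls, point)]
        self.last_t = t


class OnlinePhase:
    def __init__(self, radius, target_points, init_window):
        self.radius = radius                # hypothetical absorption threshold
        self.target_points = target_points  # hypothetical: arrivals one window should span
        self.window = init_window           # current window length, in time units
        self.clusters = []
        self.recent = []                    # arrival times inside the current window

    def insert(self, point, t):
        # Assumed resizing rule: estimate the arrival rate from recent
        # timestamps, then resize the window to span ~target_points arrivals.
        self.recent = [s for s in self.recent if t - s <= self.window] + [t]
        span = t - self.recent[0]
        if span > 0:
            rate = (len(self.recent) - 1) / span
            self.window = self.target_points / rate
        # Discard micro-clusters not updated within the current window.
        self.clusters = [c for c in self.clusters if t - c.last_t <= self.window]
        # Absorb the point into the nearest micro-cluster, or open a new one.
        nearest = min(self.clusters,
                      key=lambda c: dist(point, c.center()), default=None)
        if nearest is not None and dist(point, nearest.center()) <= self.radius:
            nearest.absorb(point, t)
        else:
            self.clusters.append(MicroCluster(point, t))


def kmeans(centers, weights, k, iters=20):
    """Weighted Lloyd's k-means over micro-cluster centers (offline phase)."""
    k = min(k, len(centers))
    if k == 0:
        return [], []
    means = [list(c) for c in centers[:k]]
    labels = [0] * len(centers)
    for _ in range(iters):
        # Assignment step: each micro-cluster goes to its nearest mean.
        labels = [min(range(k), key=lambda j: dist(c, means[j])) for c in centers]
        # Update step: each mean becomes the weighted average of its members.
        for j in range(k):
            members = [(c, w) for c, w, l in zip(centers, weights, labels) if l == j]
            total = sum(w for _, w in members)
            if total > 0:
                means[j] = [sum(c[d] * w for c, w in members) / total
                            for d in range(len(means[j]))]
    return means, labels


# Toy stream of (point, timestamp) pairs with two well-separated groups.
stream = [((0.0, 0.0), 1), ((0.1, 0.0), 2), ((5.0, 5.0), 3),
          ((5.1, 5.0), 4), ((0.0, 0.1), 5)]
online = OnlinePhase(radius=1.0, target_points=10, init_window=100.0)
for p, t in stream:
    online.insert(p, t)

means, labels = kmeans([c.center() for c in online.clusters],
                       [c.n for c in online.clusters], k=2)
```

Under this assumed rule a faster stream shortens the window and a slower stream lengthens it, so the window covers a roughly constant number of points. Weighting each micro-cluster by its point count keeps the offline means close to what k-means on the raw points would give, without storing those points.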
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.