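基于序列聚类的事件流数据特征分析 (Characteristics Analysis of Event Stream Data Based on Sequence Clustering)
Cite this article: 王勇, 王洁, 王明华, 焦丽梅. 基于序列聚类的事件流数据特征分析[J]. 计算机工程 (Computer Engineering), 2008, 34(12): 34-36.
Authors: 王勇 (WANG Yong)  王洁 (WANG Jie)  王明华 (WANG Ming-hua)  焦丽梅 (JIAO Li-mei)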
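Affiliations: 1. National Research Center for Intelligent Computing System, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080; Graduate University of Chinese Academy of Sciences, Beijing 100039
2. Information Engineering College, Capital Normal University, Beijing 100037
3. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029
4. National Research Center for Intelligent Computing System, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080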
Funding: CNGI fund of the National Development and Reform Commission
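Abstract: Event stream processing is an application that has emerged in recent years for analyzing and querying massive data entering a system in real time, and data characteristics are an important part of the workload model needed to evaluate such systems. Taking network security monitoring as the background, this paper proposes a method that aggregates event streams into time series and clusters them by similarity in order to analyze data characteristics. Event streams are converted into time series by aggregation at a suitable granularity, periodic time series are selected as representatives to eliminate random interference, and a clustering algorithm based on linear similarity between series is given. Clustering experiments show that event streams with similar temporal characteristics can be effectively grouped into the same cluster.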
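The aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the bucket width, and the phase-averaging used here as a stand-in for the periodic (seasonal) representative are all assumptions made for illustration, and the timestamps are made-up sample data.

```python
from collections import Counter


def aggregate(timestamps, start, bucket_seconds, n_buckets):
    """Count events per fixed-width time bucket (hypothetical helper)."""
    counts = Counter()
    for t in timestamps:
        idx = int((t - start) // bucket_seconds)
        if 0 <= idx < n_buckets:  # ignore events outside the window
            counts[idx] += 1
    return [counts[i] for i in range(n_buckets)]


def seasonal_profile(series, period):
    """Average the values that share the same phase across full periods.

    Assumption: this simple phase average stands in for the periodic
    component; the paper may extract it differently.
    """
    n_periods = len(series) // period
    if n_periods == 0:
        raise ValueError("series is shorter than one period")
    return [
        sum(series[k * period + p] for k in range(n_periods)) / n_periods
        for p in range(period)
    ]


# Made-up event timestamps in seconds, aggregated into 60-second buckets.
events = [0, 10, 70, 130, 125, 190, 250, 255, 310, 370, 371, 430]
series = aggregate(events, start=0, bucket_seconds=60, n_buckets=8)
profile = seasonal_profile(series, period=2)
print(series)
print(profile)
```

Averaging by phase smooths out events that occur in only one period, which is one simple way to reduce random interference before comparing streams.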

Keywords: data characteristics  time series  clustering  event stream
Article ID: 1000-3428(2008)12-0034-03
Revision date: 11 October 2007

Characteristics Analysis of Event Stream Data Based on Sequence Clustering
WANG Yong, WANG Jie, WANG Ming-hua, JIAO Li-mei. Characteristics Analysis of Event Stream Data Based on Sequence Clustering[J]. Computer Engineering, 2008, 34(12): 34-36.
Authors:WANG Yong  WANG Jie  WANG Ming-hua  JIAO Li-mei
Affiliation:(1. National Research Center for Intelligent Computing System, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080; 2. Graduate University of Chinese Academy of Sciences, Beijing 100039; 3. Information Engineering College, Capital Normal University, Beijing 100037; 4. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029)
Abstract:Event stream processing is a new kind of analysis application for massive data that enter a system in real time, and data characteristics are an important component of the workload model used to evaluate such a system. With network security monitoring as the background, this paper presents an approach that aggregates event streams into time series and characterizes the data by similarity clustering. Event streams are converted into time series by aggregation at a moderate time granularity, and the seasonal component of each time series is chosen to represent the original series so as to suppress random noise. A clustering algorithm based on similarity under scaling and shifting transformations is presented. Experiments on real data show that event streams with similar temporal characteristics are effectively grouped into the same cluster.
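Similarity under scaling and shifting can be illustrated with a short Python sketch. Two series x and y are treated as similar when y is approximately a*x + b with a > 0, which the Pearson correlation coefficient captures because it is unchanged by positive scaling and by shifting. The greedy "leader" grouping below is a swapped-in simple technique, not the clustering algorithm of the paper, and the threshold and sample series are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation; invariant to positive scaling and shifting."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    sxx = sum(v * v for v in dx)
    syy = sum(v * v for v in dy)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series has no defined correlation
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)


def leader_cluster(series_by_name, threshold=0.9):
    """Assign each series to the first cluster whose leader it correlates with.

    Hypothetical greedy scheme: the first member of a cluster is its leader.
    """
    clusters = []  # list of (leader_series, member_names)
    for name, s in series_by_name.items():
        for leader, members in clusters:
            if pearson(leader, s) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((s, [name]))
    return [members for _, members in clusters]


# Made-up profiles: "b" is 2 * "a" + 10, "c" has a different shape.
profiles = {
    "a": [1, 2, 3, 2, 1, 0],
    "b": [12, 14, 16, 14, 12, 10],
    "c": [0, 3, 0, 3, 0, 3],
}
groups = leader_cluster(profiles)
print(groups)
```

Here "a" and "b" land in the same group even though their magnitudes differ, because only the shape of the series matters under this similarity measure.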
Keywords:data characteristics  time series  clustering  event stream
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.