首页 | 本学科首页   官方微博 | 高级检索  
     


Sketching asynchronous data streams over sliding windows
Authors:Bojian Xu  Srikanta Tirthapura  Costas Busch
Affiliation:(1) Department of Electrical and Computer Engineering, Iowa State University, 2215, Coover Hall, Ames, IA 50011, USA;(2) Department of Computer Science, Louisiana State University, 280 Coates Hall, Baton Rouge, LA 70816, USA
Abstract:We study the problem of maintaining a sketch of recent elements of a data stream. Motivated by applications involving network data, we consider streams that are asynchronous, in which the observed order of data is not the same as the time order in which the data was generated. The notion of recent elements of a stream is modeled by the sliding timestamp window, which is the set of elements with timestamps that are close to the current time. We design algorithms for maintaining sketches of all elements within the sliding timestamp window that can give provably accurate estimates of two basic aggregates, the sum and the median, of a stream of numbers. The space taken by the sketches, the time needed for querying the sketch, and the time for inserting new elements into the sketch are all polylogarithmic with respect to the maximum window size. Our sketches can be easily combined in a lossless and compact way, making them useful for distributed computations over data streams. Previous works on sketching recent elements of a data stream have all considered the more restrictive scenario of synchronous streams, where the observed order of data is the same as the time order in which the data was generated. Our notion of recency of elements is more general than that studied in previous work, and thus our sketches are more robust to network delays and asynchrony. The work of the authors was supported in part through NSF grants CNS 0520102 and CNS 0520009. A preliminary version of this paper appeared in Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC) 2006, pages 82–91. Work done while the third author was at Rensselaer Polytechnic Institute. Authors are listed in reverse alphabetical order.
Keywords:Data streams  Asynchronous streams  Distributed streams  Sliding window  Sum  Median
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号