1.

Aurora: a new model and architecture for data stream management (total citations: 43; self-citations: 0; citations by others: 43)

Daniel J. Abadi Don Carney Ugur Çetintemel Mitch Cherniack Christian Convey Sangdon Lee Michael Stonebraker Nesime Tatbul Stan Zdonik 《The VLDB Journal The International Journal on Very Large Data Bases》2003,12(2):120-139

This paper describes the basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications. Monitoring applications differ substantially from conventional business data processing. The fact that a software system must process and react to continual inputs from many sources (e.g., sensors) rather than from human operators requires one to rethink the fundamental architecture of a DBMS for this application area. In this paper, we present Aurora, a new DBMS currently under construction at Brandeis University, Brown University, and M.I.T. We first provide an overview of the basic Aurora model and architecture and then describe in detail a stream-oriented set of operators. Received: 12 September 2002, Accepted: 26 March 2003, Published online: 21 July 2003. Edited by Y. Ioannidis.

2.

Design and Application of a DirectShow-Based Multimedia Streaming System (total citations: 1; self-citations: 0; citations by others: 1)

李艳辉 李军 《计算机工程与设计》2007,28(10):2379-2380,2383

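This paper explains the basic principles of DirectShow and presents the basic ideas and methods for developing DirectShow-based application systems. Using examples, it discusses multimedia stream capture and playback techniques, as well as the construction and management of DirectShow filters and the filter graph manager. It also discusses the key techniques for developing a DirectShow-based multimedia streaming system in Visual C++, describes the development approach and programming process, and gives the core code. The results show that multimedia application systems built on this basis offer better reusability and extensibility and a shorter development cycle.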

3.

Comparing a knowledge-based and a data-driven method in querying data streams for system fault detection: A hydraulic drive system application

Ahmad Alzghoul Björn Backe Magnus Löfstrand Arne Byström Bengt Liljedahl 《Computers in Industry》2014

The field of fault detection and diagnosis has been the subject of considerable interest in industry. Fault detection may increase the availability of products, thereby improving their quality. Fault detection and diagnosis methods can be classified into three categories: data-driven, analytically based, and knowledge-based methods.

4.

Data stream forecasting for system fault prediction

Ahmad Alzghoul Magnus Löfstrand Björn Backe 《Computers & Industrial Engineering》2012

Competition among today’s industrial companies is very high. Therefore, system availability plays an important role and is a critical point for most companies. Detecting failures at an early stage or foreseeing them before they occur is crucial for machinery availability. Data analysis is the most common method for machine health condition monitoring. In this paper we propose a fault-detection system based on data stream prediction, data stream mining, and a data stream management system (DSMS). Companies that are able to predict and avoid the occurrence of failures have an advantage over their competitors. The literature has shown that data prediction can also reduce the consumption of communication resources in distributed data stream processing.

5.

Data Stream Management Technology (total citations: 1; self-citations: 1; citations by others: 1)

刘学军 徐宏炳 董逸生 王永利 钱江波 《计算机科学》2005,32(4):6-10

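Recently, it has become widely recognized that in some new application domains it is more appropriate to treat data as transient data streams than as persistent relations. This paper first analyzes the limitations of traditional database management systems in processing data streams, then analyzes the basic implementation techniques of three typical data stream management systems, and discusses the current state of research on data stream management and future research directions. Finally, it presents the architecture of a prototype data stream management system.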

6.

A P4P-integrated data-driven P2P system for the live multimedia streaming service

Haesun Byun Meejeong Lee 《Computer Communications》2013,36(17-18):1698-1707

The Proactive network Provider Participation for P2P (P4P) architecture deploys central servers, which perceive network status and provide peering suggestions to P2P systems in order to achieve better network resource utilization while supporting the best possible application performance. However, P4P alone may not be able to make appropriate peering suggestions for live multimedia streaming since it does not include mechanisms to reflect some of the parameters that are important to the QoS of live multimedia streaming, such as upload bandwidth and stability of a peer as a stream deliverer. Furthermore, peer synchronization and parent replacement in the middle of a session, which are critical issues for the QoS of live multimedia streaming, are also left as matters to be dealt with by the P2P systems alone. Most of the existing data-driven P2P systems leverage periodic information exchanges among neighboring peers in order to cope with these problems, which may incur long delay and high control overhead. In this paper, we propose a P4P-integrated data-driven P2P system for live multimedia streaming service. The proposed system includes not only the peering suggestion mechanism appropriate for live multimedia streaming but also the peer synchronization and parent replacement mechanisms, which exploit the centralized P4P framework and do not require periodic control information exchanges. We implemented the system in the NS-2 simulator and compared its performance to the P4P and existing data-driven P2P systems. The results from experiments show that the proposed system enhances QoS compared to the existing data-driven P2P systems while maintaining the same level of network efficiency as the original P4P.

7.

Operator-aware approach for boosting performance in RDF stream processing

《Journal of Web Semantics》2017

To enable efficiency in stream processing, the evaluation of a query is usually performed over bounded parts of (potentially) unbounded streams, i.e., processing windows “slide” over the streams. To avoid inefficient re-evaluations of already evaluated parts of a stream with respect to a query, incremental evaluation strategies are applied, i.e., the query results are obtained incrementally from the result set of the preceding processing state without having to re-evaluate all input buffers. This method is highly efficient but it comes at the cost of having to maintain processing state, which is not trivial, and may defeat performance advantages of the incremental evaluation strategy. In the context of RDF streams the problem is further aggravated by the hard-to-predict evolution of the structure of RDF graphs over time and the application of sub-optimal implementation approaches, e.g., using relational technologies for storing data and processing states, which incur significant performance drawbacks for graph-based query patterns. To address these performance problems, this paper proposes a set of novel operator-aware data structures coupled with incremental evaluation algorithms which outperform the counterparts of relational stream processing systems. This claim is demonstrated through extensive experimental results on both simulated and real datasets.
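
As a minimal illustration of the incremental evaluation idea in this abstract (not the paper's operator-aware data structures), the sketch below maintains a sum over a count-based sliding window by adjusting the previous result instead of re-scanning the buffer. The class name `SlidingSum`, the helper `naive_sum`, and the choice of a count-based window are assumptions made for the example.

```python
from collections import deque


class SlidingSum:
    """Sum over the last `size` items, updated incrementally."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.size = size
        self.window = deque()  # processing state: items currently in the window
        self.total = 0         # processing state: running aggregate

    def push(self, value):
        """Add one item and return the sum of the current window."""
        self.window.append(value)
        self.total += value
        if len(self.window) > self.size:
            # Evict the oldest item and subtract it, rather than re-summing.
            self.total -= self.window.popleft()
        return self.total


def naive_sum(items, size):
    """Re-evaluate from scratch; used only to cross-check the result."""
    return sum(items[-size:])
```

The two fields `window` and `total` are the processing state that has to be kept consistent; that bookkeeping is the maintenance cost the abstract refers to.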

8.

Hyo-Sang Lim Yang-Sae Moon 《Information Sciences》2008,178(6):1461-1478

We propose a new similar sequence matching method that efficiently supports variable-length and variable-tolerance continuous query sequences on time-series data streams. Earlier methods do not support variable lengths or variable tolerances adequately for continuous query sequences if there are too many query sequences registered to handle in main memory. To support variable-length query sequences, we use the window construction mechanism that divides long sequences into smaller windows for indexing and searching the sequences. To support variable-tolerance query sequences, we present a new notion of intervaled sequences whose individual entries are an interval of real numbers rather than a real number itself. We also propose a new similar sequence matching method based on these notions, and then formally prove the correctness of the method. In addition, we show that our method has the prematching characteristic, which finds future candidates of similar sequences in advance. Experimental results show that our method outperforms the naive one by 2.6-102.1 times and the existing methods in the literature by 1.4-9.8 times over the entire ranges of parameters tested when the query selectivities are low (<32%), which are practically useful in large database applications.

9.

Enhancing P2P overlay network architecture for live multimedia streaming

Nen-Fu Huang Yih-Jou Tzang Hong-Yi Chang 《Information Sciences》2010,180(17):3210-4040

The number of live multimedia streaming applications is increasing, which explains the use of many overlay network topologies. Application-layer multicast (ALM), a feasible alternative for multimedia streaming, has attracted considerable attention. However, a serious problem of ALM is that the multicast tree may be fragile, and peer failure causes tree partitions. This work presents a novel Hierarchical Ring Tree (HRT) architecture for Peer-to-Peer (P2P) live multimedia streaming. The proposed architecture combines ring-based and tree-based structures in a robust, scalable, reliable and resilient structure that can be used practically as an ALM topology. When peers enter or leave the system, the topology can be recovered rapidly such that a live multimedia stream can be delivered smoothly with low latency. The proposed HRT topology is maintained efficiently without splitting or merging trees. The performance of the proposed architecture and algorithms is evaluated experimentally. Experimental results indicate that the proposed topology can be used in a high-churn P2P network with a small delay. Simulation and experiment results reveal that the proposed architecture has a lower overhead than the ZIGZAG approach when handling peers’ joining or leaving, exhibits faster recovery, better quality-of-service during streaming, and a more robust topology, even with an extremely high number of peers joining/leaving.

10.

Increasing availability of industrial systems through data stream mining

Ahmad Alzghoul Magnus Löfstrand 《Computers & Industrial Engineering》2011

Improving industrial product reliability, maintainability and thus availability is a challenging task for many industrial companies. In industry, there is a growing need to process data in real time, since the generated data volume exceeds the available storage capacity. This paper consists of a review of data stream mining and data stream management systems aimed at improving product availability. Further, a newly developed and validated grid-based classifier method is presented and compared to a one-class support vector machine (OCSVM) and a polygon-based classifier.

11.

Frequency-based load shedding over a data stream of tuples

Joong Hyuk Chang Hye-Chung Kum 《Information Sciences》2009,179(21):3733-2389

Usually the data generation rate of a data stream is unpredictable, and some data elements of the data stream cannot be processed in real time if the generation rate exceeds the capacity of a data stream processing algorithm. In order to overcome this situation gracefully, a load shedding technique is recommended. This paper proposes a frequency-based load shedding technique over a data stream of tuples. In many data stream processing applications, such as mining frequent patterns, data elements having high frequency can be considered more significant than others having low frequency. Based on this observation, in the proposed technique, only frequent elements of a data stream are processed in real time while the others are trimmed. The decision to shed a load from the data stream or not is controlled automatically by the data generation rate of a data stream. Consequently, an unnecessary load shedding operation is not allowed in the proposed technique.
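
The general idea in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the function name `shed_by_frequency`, the per-time-unit `capacity` parameter, and the rule of keeping the highest-count elements first are all hypothetical choices made for the example.

```python
from collections import Counter


def shed_by_frequency(batch, capacity, counts=None):
    """Return the elements of `batch` to process, shedding only when overloaded.

    batch    -- elements that arrived in one time unit
    capacity -- hypothetical number of elements processable per time unit
    counts   -- running frequency counts, updated in place
    """
    counts = Counter() if counts is None else counts
    counts.update(batch)              # every arrival is counted, even if shed
    if len(batch) <= capacity:
        return list(batch)            # no overload: nothing is shed
    # Overload: rank positions by the frequency of their element, highest first.
    # sorted() is stable, so ties keep their arrival order.
    ranked = sorted(range(len(batch)), key=lambda i: -counts[batch[i]])
    keep = sorted(ranked[:capacity])  # restore arrival order
    return [batch[i] for i in keep]
```

Because shedding is triggered only when the batch exceeds `capacity`, a slow stream is processed in full, which mirrors the abstract's point that unnecessary shedding is avoided.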

12.

Building a User-Friendly Multimedia Management System with an Internet of Things Architecture

梁柏榉 谢运佳 黄芳明 方向阳 蓝丽萍 《物联网技术》2013,(6):68-70,73

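The development of the Internet of Things (IoT) industry is application-led, focusing mainly on expanding the scale and range of applications and tapping new application potential. This article describes the application of IoT technology in multimedia management systems and, starting from how an IoT architecture can make such systems more user-friendly, designs a multimedia management system based on IoT technology. It also presents the system architecture that was the focus of the design, analyzes the technical difficulties involved, and proposes corresponding solutions.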

13.

Optimization Techniques for Multiple Continuous Queries over Data Streams

赵宗敏 王洋 吴海涛 《计算机应用》2009,29(Z2)

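Starting from the characteristics of data streams (continuous arrival, unbounded size, and strong real-time requirements), this paper introduces the basic concept of multiple continuous queries over data streams. Based on the characteristics of multiple continuous queries and on user requirements, it divides optimization techniques into single-stream multi-query and multi-stream multi-query techniques. It discusses in detail filtering-based optimization for multiple continuous queries over a single stream and sharing-based optimization for multiple continuous queries over multiple streams. Through a comprehensive and systematic analysis of the basic idea behind each optimization algorithm, it identifies the advantages, disadvantages, and suitable use cases of each technique.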

14.

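Research on Synchronization Mechanisms for Real-Time Multimedia Streams (total citations: 1; self-citations: 0; citations by others: 1)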

葛双全 李芬 《电脑与信息技术》2006,14(4):5-8

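Synchronization of real-time multimedia streams is one of the basic requirements and main difficulties of streaming media applications in a network environment. This article introduces the basic concepts of multimedia synchronization, analyzes why real-time multimedia streams lose synchronization at each stage in an IP network environment, and proposes a solution for each of these causes.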

15.

Tree-based partition querying: a methodology for computing medoids in large spatial datasets

Kyriakos Mouratidis Dimitris Papadias Spiros Papadimitriou 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(4):923-945

Besides traditional domains (e.g., resource allocation, data mining applications), algorithms for medoid computation and related problems will play an important role in numerous emerging fields, such as location based services and sensor networks. Since the k-medoid problem is NP-hard, all existing work deals with approximate solutions on relatively small datasets. This paper aims at efficient methods for very large spatial databases, motivated by: (1) the high and ever increasing availability of spatial data, and (2) the need for novel query types and improved services. The proposed solutions exploit the intrinsic grouping properties of a data partition index in order to read only a small part of the dataset. Compared to previous approaches, we achieve results of comparable or better quality at a small fraction of the CPU and I/O costs (seconds as opposed to hours, and tens of node accesses instead of thousands). In addition, we study medoid-aggregate queries, where k is not known in advance, but we are asked to compute a medoid set that leads to an average distance close to a user-specified value. Similarly, medoid-optimization queries aim at minimizing both the number of medoids k and the average distance. We also consider the max version of the aforementioned problems, where the goal is to minimize the maximum (instead of the average) distance between any object and its closest medoid. Finally, we investigate bichromatic and weighted medoid versions for all query types, as well as maximum capacity and dynamic medoids.
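
To make the two objectives named in this abstract concrete, the sketch below evaluates a given medoid set under the average-distance objective and under the max variant. It only scores a candidate set by brute force; it does not implement the paper's index-based method. The function names and the use of Euclidean distance are assumptions made for the example.

```python
from math import dist


def assignment_costs(points, medoids):
    """Distance from each point to its closest medoid."""
    return [min(dist(p, m) for m in medoids) for p in points]


def average_distance(points, medoids):
    """Objective of the standard k-medoid problem."""
    costs = assignment_costs(points, medoids)
    return sum(costs) / len(costs)


def max_distance(points, medoids):
    """Objective of the max variant: the worst-served point."""
    return max(assignment_costs(points, medoids))
```

For example, with points `(0, 0)`, `(0, 2)`, `(10, 0)`, `(10, 4)` and medoids `(0, 0)` and `(10, 0)`, the per-point costs are 0, 2, 0 and 4, so the average distance is 1.5 and the maximum distance is 4.0.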

16.

Query indexing with containment-encoded intervals for efficient stream processing (total citations: 1; self-citations: 0; citations by others: 1)

Kun-Lung Wu Shyh-Kwei Chen Philip S. Yu 《Knowledge and Information Systems》2006,9(1):62-90

Many continual range queries can be issued against data streams. To efficiently evaluate continual queries against a stream, a main memory-based query index with a small storage cost and a fast search time is needed, especially if the stream is rapid. In this paper, we study a CEI-based query index that meets both criteria for efficient processing of continual interval queries. This new query index is an indirect indexing approach. It centres around a set of predefined virtual containment-encoded intervals, or CEIs. The CEIs are used to first decompose query intervals and then perform efficient search operations. The CEIs are defined and labeled such that containment relationships among them are encoded in their IDs. The containment encoding makes decomposition and search operations efficient; from the encoding of the smallest CEI containing a data point, the encodings of other containing CEIs can be easily derived. Closed-form formulae for the bounds of the average index storage cost are derived. Simulations are conducted to evaluate the effectiveness of the CEI-based query index and to compare it with alternative approaches. The results show that the CEI-based query index significantly outperforms existing approaches in terms of both storage cost and search time.
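
One way to encode containment in interval IDs is sketched below, as an illustration of the property described in the CEI abstract. The abstract does not spell out the labeling, so the perfect-binary-tree numbering used here (ID 1 for a whole segment, IDs 2 and 3 for its halves, and so on) is an assumption, as are the function names and the restriction to a single segment of integer points.

```python
def smallest_cei(x, seg_len):
    """ID of the unit-length interval containing integer point x.

    Intervals inside one segment [0, seg_len) are labeled like a perfect
    binary tree: ID 1 covers the whole segment, IDs 2 and 3 its halves,
    and IDs seg_len .. 2*seg_len - 1 the unit-length intervals.
    """
    if seg_len <= 0 or seg_len & (seg_len - 1):
        raise ValueError("seg_len must be a power of two")
    if not 0 <= x < seg_len:
        raise ValueError("x outside the segment")
    return seg_len + x


def containing_ceis(x, seg_len):
    """IDs of every interval containing x, smallest interval first."""
    ids = []
    cei = smallest_cei(x, seg_len)
    while cei >= 1:
        ids.append(cei)
        cei >>= 1  # under this labeling, the parent ID is a right shift
    return ids


def interval_of(cei, seg_len):
    """Half-open range [lo, hi) covered by an interval ID."""
    level = cei.bit_length() - 1   # 0 for the whole segment
    width = seg_len >> level
    offset = cei - (1 << level)    # position among intervals of this level
    return (offset * width, (offset + 1) * width)
```

With a segment of length 8, the point 5 falls in unit interval 13, and shifting right yields 6, 3 and 1, which cover [4, 6), [4, 8) and [0, 8). A search therefore needs one addition and a few shifts rather than a tree traversal.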

17.

The ARCHIMEDES Network System: a system for searching and accessing information in multiple multimedia sources

Jan L.G. Dietz Ruud van der Pol Floris Wiesman 《Journal of Intelligent Information Systems》1997,8(1):77-101

The amount of information available to information workers recently has become overwhelming. This confronts information workers with two major problems: finding the information needed, and accessing it; they are called the search problem and the access problem, respectively. As the main result of our research an architecture is specified of an automated tool that provides integrated support for searching and accessing multimedia documents that may be located at arbitrary places. The architecture contains a database with information about the documents and with thesaurus-like information. The architecture also contains a browse mechanism and a query mechanism for inspecting the database. In the design process of the architecture, several fundamental questions arose, like “What is a document?” and “What is a medium kind?”. The developed answers to some of these questions are considered to have a general character and thus to be useful also outside the scope of the research at hand. The paper concludes with an overview of the current status of the project and a discussion of future work.

18.

Optimizing data stream processing for large‐scale applications

Paolo Cappellari Mark Roantree Soon Ae Chun 《Software》2018,48(9):1607-1641

Stream processing systems are designed to analyze data arriving in real time using continuous queries and to respond when a specific event or sequence of events is detected. An important aspect of these systems is Streaming Analytics, which facilitates statistical calculations on continuous data within the stream. These systems must be designed to handle high volumes of data, be scalable, and accommodate a multitude of long‐lived concurrently running analytics. The challenges involved in the development of stream processing include on‐the‐fly transformation of data streams to match the query needs of users, the ability to model stream transformations to detect overlaps and possibilities for optimizations, and the specification of a methodology to deliver optimizations. In particular, this work focuses on exposing data stream application internals in order to detect reusable parts and then consolidate applications to optimize computational resource usage. The Streaming Data Analytics Model presented in this paper adopts a declarative approach that enables processing and manipulation of data streams in a simple manner while facilitating powerful optimizations necessary for managing high volumes of streaming data in real time. An evaluation is provided to demonstrate in both theoretical and quantitative aspects the high performance offered by our approach.

19.

The CQL continuous query language: semantic foundations and query execution (total citations: 2; self-citations: 0; citations by others: 2)

Arvind Arasu Shivnath Babu Jennifer Widom 《The VLDB Journal The International Journal on Very Large Data Bases》2006,15(2):121-142

CQL, a continuous query language, is supported by the STREAM prototype data stream management system (DSMS) at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and stored relations. We begin by presenting an abstract semantics that relies only on “black-box” mappings among streams and relations. From these mappings we define a precise and general interpretation for continuous queries. CQL is an instantiation of our abstract semantics using SQL to map from relations to relations, window specifications derived from SQL-99 to map from streams to relations, and three new operators to map from relations to streams. Most of the CQL language is operational in the STREAM system. We present the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries. Examples throughout the paper are drawn from the Linear Road benchmark recently proposed for DSMSs. We also curate a public repository of data stream applications that includes a wide variety of queries expressed in CQL. The relative ease of capturing these applications in CQL is one indicator that the language contains an appropriate set of constructs for data stream processing. Edited by M. Franklin
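
The three relation-to-stream operators are not named in this abstract; in the CQL literature they are Istream, Dstream and Rstream. The sketch below models them, together with a simple row-based window for the stream-to-relation step. It is a simplification: CQL relations are multisets, while Python sets are used here for brevity, and the function names and the list-based stream are assumptions made for the example.

```python
def istream(prev, curr):
    """Tuples inserted at this time step: R(t) - R(t-1)."""
    return curr - prev


def dstream(prev, curr):
    """Tuples deleted at this time step: R(t-1) - R(t)."""
    return prev - curr


def rstream(prev, curr):
    """Every tuple in the relation at this time step: R(t)."""
    return set(curr)


def row_window(stream, n):
    """Stream-to-relation: the last n tuples after each arrival."""
    for i in range(1, len(stream) + 1):
        yield set(stream[max(0, i - n):i])
```

For the stream `["a", "b", "c"]` with a two-row window, the relation moves from `{"a", "b"}` to `{"b", "c"}` on the third arrival, so Istream emits `c`, Dstream emits `a`, and Rstream emits both `b` and `c`.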

20.

SNCStream+: Extending a high quality true anytime data stream clustering algorithm

《Information Systems》2016

Data Stream Clustering is an active area of research which requires efficient algorithms capable of finding and updating clusters incrementally as data arrives. On top of that, due to the inherent evolving nature of data streams, it is expected that algorithms undergo both concept drifts and evolutions, which must be taken into account by the clustering algorithm, allowing incremental clustering updates. In this paper we present the Social Network Clusterer Stream⁺ (SNCStream⁺). SNCStream⁺ tackles the data stream clustering problem as a network formation and evolution problem, where instances and micro-clusters form clusters based on homophily. Our proposal has its parameters analyzed and it is evaluated in a broad set of problems against literature baselines. Results show that SNCStream⁺ achieves superior clustering quality (CMM), and feasible processing time and memory space usage when compared to the original SNCStream and other proposals of the literature.