An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other one focused on optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods, by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of methods.
Recommended by: Patrick Valduriez 相似文献
We introduce a semantic data model to capture the hierarchical, spatial, temporal, and evolutionary semantics of images in pictorial databases. This model mimics the user's conceptual view of the image content, providing the framework and guidelines for preprocessing to extract image features. Based on the model constructs, a spatial evolutionary query language (SEQL), which provides direct image object manipulation capabilities, is presented. With semantic information captured in the model, spatial evolutionary queries are answered efficiently. Using an object-oriented platform, a prototype medical-image management system was implemented at UCLA to demonstrate the feasibility of the proposed approach. 相似文献
We present methods to generate rendering sequences for triangle meshes which preserve mesh locality as much as possible. This is useful for maximizing vertex reuse when rendering the mesh using a FIFO vertex buffer, such as those available in modern 3D graphics hardware. The sequences are universal in the sense that they perform well for all sizes of vertex buffers, and generalize to progressive meshes. This has been verified experimentally. 相似文献
We propose a general modeling framework to evaluate the performance of cache consistency algorithms. In addition to the usual hit rate, we introduce the hit* rate as a consistency measure, which captures the fraction of non-stale downloads from the cache. We apply these ideas to the analysis of the fixed TTL consistency algorithm in the presence of network delays. The hit and hit* rates are evaluated when requests and updates are modeled by renewal processes. Classical results on the renewal function lead to various bounds. 相似文献
LDL is one of the recently proposed logical query languages, which incorporate set, for data and knowledge base systems. Since LDL programs can simulate negation, they are not monotonic in general. On the other hand, there are monotonic LDL programs. This paper addresses the natural question of “When are the generally nonmonotonic LDL programs monotonic?” and investigates related topics such as useful applications for monotonicity. We discuss four kinds of monotonicity, and examine two of them in depth. The first of the two, called “ω-monotonicity”, is shown to be undecidable even when limited to single-stratum programs. The second, called “uniform monotonicity”, is shown to implyω-monotonicity. We characterize the uniform monotonicity of a program (i) by a relationship between its Bancilhon-Khoshafian semantics and its LDL semantics, and (ii) with a useful property called subset completion independence. Characterization (ii) implies that uniformly monotonie programs can be evaluated more efficiently by discarding dominated facts. Finally, we provide some necessary and/or sufficient, syntactic conditions for uniform monotonicity. The conditions pinpoint (a) enumerated set terms, (b) negations of membership and inclusion, and (c) sharing of set terms as the main source for nonuniform monotonicity. 相似文献
The development of future mobile networks will be driven, in large part, by content and web based services. In this paper,
we examine several performance, scalability and architectural challenges faced by future mobile web applications and how advanced
mobile content delivery techniques can address these challenges. We review existing content delivery using a taxonomy that
consists of three categories: network scaling, end system acceleration, and content and protocol optimization. While wireline
content delivery focuses on network and server scalability, mobile content delivery will likely benefit most from optimizing
radio link usage. We also present our ongoing work in this area, which extends the functionality of edge caching to the terminal,
uses user interest correlation information to maintain low terminal power consumption and adds a new dimension to radio resource
management.
This revised version was published online in August 2006 with corrections to the Cover Date. 相似文献
This paper concerns the following problem: given a set of multi-attribute records, a fixed number of buckets and a two-disk system, arrange the records into the buckets and then store the buckets between the disks in such a way that, over all possible orthogonal range queries (ORQs), the disk access concurrency is maximized. We shall adopt the multiple key hashing (MKH) method for arranging records into buckets and use the disk modulo (DM) allocation method for storing buckets onto disks. Since the DM allocation method has been shown to be superior to any other allocation methods for allocating an MKH file onto a two-disk system for answering ORQs, the real issue is knowing how to determine an optimal way for organizing the records into buckets based upon the MKH concept.
A performance formula that can be used to evaluate the average response time, over all possible ORQs, of an MKH file in a two-disk system using the DM allocation method is first presented. Based upon this formula, it is shown that our design problem is related to a notoriously difficult problem, namely the Prime Number Problem. Then a performance lower bound and an efficient algorithm for designing optimal MKH files in certain cases are presented. It is pointed out that in some cases the optimal MKH file for ORQs in a two-disk system using the DM allocation method is identical to the optimal MKH file for ORQs in a single-disk system and the optimal average response time in a two-disk system is slightly greater than one half of that in a single-disk system. 相似文献