
1.

Optimization of parallel query execution plans in XPRS (Total citations: 1; self-citations: 0; citations by others: 1)

Wei Hong, Michael Stonebraker 《Distributed and Parallel Databases》1993,1(1):9-32

In this paper, we describe our approach to optimization of query execution plans in XPRS, a multiuser parallel database system based on a shared memory multiprocessor and a disk array. The main difficulties in this optimization problem are the parameters that are unknown at compile time, such as available buffer size and number of free processors, and the enormous search space of possible parallel plans. We deal with these problems with a novel two-phase optimization strategy which dramatically reduces the search space and allows run-time parameters without significantly compromising plan optimality. In this paper we present our two-phase optimization strategy and give experimental evidence from XPRS benchmarks indicating that it almost always produces optimal or close to optimal plans.

2.

Parametric query optimization

Yannis E. Ioannidis, Raymond T. Ng, Kyuseok Shim, Timos K. Sellis 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(2):132-151

In most database systems, the values of many important run-time parameters of the system, the data, or the query are unknown at query optimization time. Parametric query optimization attempts to identify at compile time several execution plans, each one of which is optimal for a subset of all possible values of the run-time parameters. The goal is that at run time, when the actual parameter values are known, the appropriate plan should be identifiable with essentially no overhead. We present a general formulation of this problem and study it primarily for the buffer size parameter. We adopt randomized algorithms as the main approach to this style of optimization and enhance them with a sideways information passing feature that increases their effectiveness in the new task. Experimental results of these enhanced algorithms show that they optimize queries for large numbers of buffer sizes in the same time needed by their conventional versions for a single buffer size, without much sacrifice in the output quality and with essentially zero run-time overhead. Edited by S. Zdonik / Received June 1993 / Accepted April 1996
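The run-time side of this idea can be illustrated with a minimal sketch. It assumes the compile-time phase has already produced a table mapping buffer-size ranges to plans; the table contents, the plan names, and the function `choose_plan` are hypothetical and are not taken from the paper. Selecting a plan is then a binary search over the range boundaries, which is one way the "essentially no overhead" goal could be met.

```python
from bisect import bisect_right

# Hypothetical output of the compile-time phase: each entry is
# (smallest buffer size in pages for which the plan is preferred, plan name).
# Entries are sorted by threshold; the values are made up for illustration.
PLAN_TABLE = [
    (0, "nested-loop join"),
    (100, "sort-merge join"),
    (1000, "hash join"),
]


def choose_plan(buffer_pages, table=PLAN_TABLE):
    """Pick the precomputed plan whose buffer-size range contains buffer_pages."""
    thresholds = [low for low, _ in table]
    # Index of the last threshold that is <= buffer_pages.
    idx = bisect_right(thresholds, buffer_pages) - 1
    if idx < 0:
        raise ValueError("buffer size below the smallest precomputed range")
    return table[idx][1]


if __name__ == "__main__":
    for pages in (50, 100, 5000):
        print(pages, "->", choose_plan(pages))
```

The lookup costs O(log n) in the number of precomputed plans, so the expensive search happens only once, at compile time.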

3.

Robust heuristic algorithms for exploiting the common tasks of relational cloud database queries

《Applied Soft Computing》2015

Cloud computing enables a conventional relational database system's hardware to be adjusted dynamically according to query workload, performance and deadline constraints. One can rent a large amount of resources for a short duration in order to run complex queries efficiently on large-scale data with virtual machine clusters. Complex queries usually contain common subexpressions, either in a single query or among multiple queries that are submitted as a batch. The common subexpressions scan the same relations, compute the same tasks (join, sort, etc.), and/or ship the same data among virtual computers. The total time spent on the queries can be reduced by executing these common tasks only once. In this study, we build and use efficient sets of query execution plans to reduce the total execution time. This is an NP-hard problem; therefore, a set of robust heuristic algorithms (Branch-and-Bound, Genetic, Hill Climbing, and Hybrid Genetic-Hill Climbing) is proposed to find (near-)optimal query execution plans and maximize the benefits. The optimization time of each algorithm for identifying the query execution plans and the quality of these plans are analyzed by extensive experiments.

4.

Using views to generate efficient evaluation plans for queries

《Journal of Computer and System Sciences》2007,73(5):703-724

We study the problem of generating efficient, equivalent rewritings using views to compute the answer to a query. We take the closed-world assumption, in which views are materialized from base relations, rather than views describing sources in terms of abstract predicates, as is common when the open-world assumption is used. In the closed-world model, there can be an infinite number of different rewritings that compute the same answer, yet have quite different performance. Query optimizers take a logical plan (a rewriting of the query) as an input, and generate efficient physical plans to compute the answer. Thus our goal is to generate a small subset of the possible logical plans without missing an optimal physical plan. We first consider a cost model that counts the number of subgoals in a physical plan, and show a search space that is guaranteed to include an optimal rewriting, if the query has a rewriting in terms of the views. We also develop an efficient algorithm for finding rewritings with the minimum number of subgoals. We then consider a cost model that counts the sizes of intermediate relations of a physical plan, without dropping any attributes, and give a search space for finding optimal rewritings. Our final cost model allows attributes to be dropped in intermediate relations. We show that, by careful variable renaming, it is possible to do better than the standard “supplementary relation” approach, by dropping attributes that the latter approach would retain. Experiments show that our algorithm for generating optimal rewritings has good efficiency and scalability.

5.

An adaptable distributed query processing architecture

Yongluan Zhou, Beng Chin Ooi, Kian-Lee Tan, Wee Hyong Tok 《Data & Knowledge Engineering》2005,53(3):1-309

Traditionally, distributed query optimization techniques generate static query plans at compile time. However, the optimality of these plans depends on many parameters (such as the selectivities of operations, the transmission speeds and workloads of servers) that are not only difficult to estimate but also often unpredictable and fluctuating at runtime. As the query processor cannot dynamically adjust the plans at runtime, the system performance is often less than satisfactory. In this paper, we introduce a new highly adaptive distributed query processing architecture. Our architecture can quickly detect fluctuations in selectivities of operations, as well as transmission speeds and workloads of servers, and accordingly change the operation order of a distributed query plan during execution. We have implemented a prototype based on the Telegraph system [Telegraph project. Available from >]. Our experimental study shows that our mechanism can adapt itself to changes in the environment and hence approach an optimal plan during execution.

6.

An efficient multiversion access structure (Total citations: 1; self-citations: 0; citations by others: 1)

Varman P.J., Verma R.M. 《Knowledge and Data Engineering, IEEE Transactions on》1997,9(3):391-409

An efficient multiversion access structure for a transaction-time database is presented. Our method requires optimal storage and query times for several important queries and logarithmic update times. Three version operations (inserts, updates, and deletes) are allowed on the current database, while queries are allowed on any version, present or past. The following query operations are performed in optimal query time: key range search, key history search, and time range view. The key-range query retrieves all records having keys in a specified key range at a specified time; the key history query retrieves all records with a given key in a specified time range; and the time range view query retrieves all records that were current during a specified time interval. Special cases of these queries include the key search query, which retrieves a particular version of a record, and the snapshot query, which reconstructs the database at some past time. To the best of our knowledge, no previous multiversion access structure simultaneously supports all these query and version operations within these time and space bounds. The bounds on query operations are worst case per operation, while those for storage space and version operations are (worst-case) amortized over a sequence of version operations. Simulation results show that good storage utilization and query performance are obtained.
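To make the three query types concrete, here is a deliberately naive sketch that evaluates them by a linear scan over a list of version records. It is not the access structure from the paper and has none of its time or space bounds; the `Version` record, the half-open validity interval `[start, end)`, and all function names are assumptions introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One version of a record, current during the half-open interval [start, end)."""
    key: int
    value: str
    start: int             # transaction time at which this version became current
    end: float = math.inf  # transaction time at which it was superseded or deleted


def key_range(versions, lo, hi, t):
    """All records with lo <= key <= hi that were current at time t."""
    return [v for v in versions if lo <= v.key <= hi and v.start <= t < v.end]


def key_history(versions, key, t1, t2):
    """All versions of one key that were current at some time in [t1, t2]."""
    return [v for v in versions if v.key == key and v.start <= t2 and v.end > t1]


def time_range_view(versions, t1, t2):
    """All records that were current at some time in [t1, t2]."""
    return [v for v in versions if v.start <= t2 and v.end > t1]


def snapshot(versions, t):
    """Special case of key_range: the whole database as of time t."""
    return key_range(versions, -math.inf, math.inf, t)


if __name__ == "__main__":
    log = [
        Version(1, "a", 0, 5),  # key 1 inserted at t=0, updated at t=5
        Version(1, "b", 5),     # current version of key 1
        Version(2, "x", 2, 8),  # key 2 inserted at t=2, deleted at t=8
        Version(3, "y", 9),     # key 3 inserted at t=9
    ]
    print([v.value for v in key_range(log, 1, 2, 3)])
    print([v.value for v in key_history(log, 1, 4, 6)])
    print([v.value for v in time_range_view(log, 8, 9)])
```

Every query here costs time linear in the total number of versions; the point of a dedicated multiversion index is to avoid exactly that scan.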

7.

Using differential techniques to efficiently support transaction time

Christian S. Jensen Ph.D., Leo Mark Ph.D., Nick Roussopoulos Ph.D., Timos Sellis Ph.D. 《The VLDB Journal The International Journal on Very Large Data Bases》1993,2(1):75-111

We present an architecture for query processing in the relational model extended with transaction time. The architecture integrates standard query optimization and computation techniques with new differential computation techniques. Differential computation computes a query incrementally or decrementally from the cached and indexed results of previous computations. The use of differential computation techniques is essential in order to provide efficient processing of queries that access very large temporal relations. Alternative query plans are integrated into a state transition network, where the state space includes backlogs of base relations, cached results from previous computations, a cache index, and intermediate results; the transitions include standard relational algebra operators, operators for constructing differential files, operators for differential computation, and combined operators. A rule set is presented to prune away parts of state transition networks that are not promising, and dynamic programming techniques are used to identify the optimal plans from the remaining state transition networks. An extended logical access path serves as a structuring index on the cached results and contains, in addition, vital statistics for the query optimization process (including statistics about base relations, backlogs, and queries that were previously computed and cached, previously computed, or just previously estimated).

8.

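An optimization strategy for data cube query conditions

Wang Yuanzhen (王元珍), Li Jing (李靖) 《计算机工程》 (Computer Engineering) 2003,29(2):88-90

When a data cube query condition is not in conjunctive form, it is generally converted into a union of several conjunctive forms (disjunctive normal form). However, if these conjunctive forms overlap, the records in the overlapping parts are queried more than once. To solve this problem, the paper proposes an optimization strategy for data cube query conditions that converts the query condition into a union of mutually disjoint cube blocks. The paper discusses in detail how to partition a data cube into mutually disjoint blocks, and gives an implementation algorithm and a performance analysis for the strategy. The results show that the strategy clearly improves query performance when the query condition is not in conjunctive form.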
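The idea of rewriting overlapping range conditions as disjoint blocks can be sketched as follows. This is a generic box-subtraction routine, not the partitioning method from the paper; it assumes each conjunctive condition is an axis-aligned box given as one half-open integer interval `(lo, hi)` per dimension, and the function names are hypothetical.

```python
def subtract(a, b):
    """Return disjoint boxes that together cover the part of box a outside box b."""
    # If the boxes fail to overlap in any one dimension, a is untouched.
    if any(min(ah, bh) <= max(al, bl) for (al, ah), (bl, bh) in zip(a, b)):
        return [a]
    pieces = []
    rest = list(a)  # the part of a not yet assigned to a piece
    for d, ((al, ah), (bl, bh)) in enumerate(zip(a, b)):
        if al < bl:  # slab of a lying below b in dimension d
            pieces.append(tuple(rest[:d] + [(al, bl)] + rest[d + 1:]))
        if bh < ah:  # slab of a lying above b in dimension d
            pieces.append(tuple(rest[:d] + [(bh, ah)] + rest[d + 1:]))
        # Shrink to the overlap in this dimension before moving to the next.
        rest[d] = (max(al, bl), min(ah, bh))
    return pieces


def disjoint_union(boxes):
    """Rewrite a union of possibly overlapping boxes as a union of disjoint boxes."""
    result = []
    for box in boxes:
        fragments = [box]
        for kept in result:
            fragments = [p for f in fragments for p in subtract(f, kept)]
        result.extend(fragments)
    return result


def volume(box):
    """Number of integer cells in a box with half-open intervals."""
    v = 1
    for lo, hi in box:
        v *= hi - lo
    return v


if __name__ == "__main__":
    # Two overlapping 2-D range conditions, e.g. (0<=x<4 AND 0<=y<4) OR (2<=x<6 AND 2<=y<6).
    for block in disjoint_union([((0, 4), (0, 4)), ((2, 6), (2, 6))]):
        print(block, volume(block))
```

Because the resulting blocks do not overlap, each cell of the cube is read at most once, which is the effect the abstract describes.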

9.

A multi-colony ant algorithm for optimizing join queries in distributed database systems

Ladan Golshanara, Seyed Mohammad Taghi Rouhani Rankoohi, Hamed Shah-Hosseini 《Knowledge and Information Systems》2014,39(1):175-206

Distributed database systems provide a new data processing and storage technology for decentralized organizations of today. Query optimization, the process of generating an optimal execution plan for the posed query, is more challenging in such systems due to the huge search space of alternative plans incurred by distribution. As finding an optimal execution plan is computationally intractable, using stochastic-based algorithms has drawn the attention of most researchers. In this paper, for the first time, a multi-colony ant algorithm is proposed for optimizing join queries in a distributed environment where relations can be replicated but not fragmented. In the proposed algorithm, four types of ants collaborate to create an execution plan. Hence, there are four ant colonies in each iteration. Each type of ant makes an important decision to find the optimal plan. In order to evaluate the quality of the generated plan, two cost models are used, one based on the total time and the other on the response time. The proposed algorithm is compared with two previous genetic-based algorithms on chain, tree and cyclic queries. The experimental results show that the proposed algorithm saves up to about 80% of optimization time with no significant difference in the quality of generated plans compared with the best existing genetic-based algorithm.

10.

Optimal prefix and suffix queries on texts

Maxime Crochemore, Costas S. Iliopoulos, M. Sohel Rahman 《Information Processing Letters》2008,108(5):320-325

In this paper, we study a restricted version of the position-restricted pattern matching problem introduced and studied by Mäkinen and Navarro [V. Mäkinen, G. Navarro, Position-restricted substring searching, in: J.R. Correa, A. Hevia, M.A. Kiwi (Eds.), LATIN, in: Lecture Notes in Computer Science, vol. 3887, Springer, 2006, pp. 703-714]. In the problem handled in this paper, we are interested in those occurrences of the pattern that lie in a suffix or in a prefix of the given text. We achieve optimal query time for our problem with a data structure that extends the classic suffix tree. The time and space complexity of the data structure is dominated by that of the suffix tree. Notably, the (best) algorithm by Mäkinen and Navarro, if applied to our problem, gives sub-optimal query time, and the corresponding data structure also requires more time and space.

11.

Gröbner bases for polynomial systems with parameters

Antonio Montes, Michael Wibmer 《Journal of Symbolic Computation》2010

Gröbner bases are the computational method par excellence for studying polynomial systems. In the case of parametric polynomial systems, one has to determine the reduced Gröbner basis depending on the values of the parameters. In this article, we present the algorithm GröbnerCover, which takes as input a finite set of parametric polynomials and outputs a finite partition of the parameter space into locally closed subsets together with polynomial data, from which the reduced Gröbner basis for a given parameter point can immediately be determined. The partition of the parameter space is intrinsic and particularly simple if the system is homogeneous.

12.

Solving CSG equations for checking equivalency between two different geometric models (Total citations: 1; self-citations: 0; citations by others: 1)

Zhengdong Huang, Shaopeng Tian 《Computer aided design》2004,36(10):975-992

For two given parametric constructive solid geometry (CSG) models, the problem of determining their parameter domains in which the two models are equivalent is addressed. Here, two CSG models are equivalent if they represent the exact same region in R³, although their constituent features and feature attributes may differ. In this paper, an approach for solving the problem in a limited scope is proposed, in which a CSG model is polyhedral, its parametric form is explicit and its feature orientations are fixed. The solution includes the equivalent parameter domain for each model and parameter mapping that associates these two models on their equivalent parameter domains. One application of this research is to identify the equivalent parameter domains of two parametric part models, respectively, in the design option space and in the capability envelope set of a parametric machining process in order to facilitate the interoperation between part design and process planning through the generated parameter mapping between these two different kinds of models.

13.

On solving efficiently the view selection problem under bag and bag-set semantics

《Information Systems》2014

In this paper, we investigate the problem of view selection for workloads of conjunctive queries under bag and bag-set semantics. In particular, for both semantics we aim to limit the search space of candidate viewsets. We also start delineating boundaries between query workloads for which certain even more restricted search spaces suffice. They suffice in the sense that they do not compromise optimality: they contain at least one of the optimal solutions. We start with the general case for both bag and bag-set semantics, where we give a tight condition that candidate views can be required to satisfy while the search space (thus limited) still contains at least one optimal solution. We show that these results, for both semantics, reduce the size of the search space significantly. Further on, building on this analysis for both semantics, a delineation of the space of viewsets and the space of the corresponding equivalent rewritings for a certain conjunctive query workload is given. We show that for chain query workloads under both bag and bag-set semantics, taking only chain views may miss optimal solutions, whereas, if we further limit the queries to be path-queries (i.e., chain queries over a single binary relation), then, under bag semantics, path-views suffice. Concentrating on bag-set semantics, we show that the path-viewsets do not suffice for every path-query workload.

14.

Weighted hypertree decompositions and optimal query plans

《Journal of Computer and System Sciences》2007,73(3):475-506

Hypertree width is a measure of the degree of cyclicity of hypergraphs. A number of relevant problems from different areas, e.g., the evaluation of conjunctive queries in database theory or constraint satisfaction in AI, are tractable when their underlying hypergraphs have bounded hypertree width. However, in practical contexts like the evaluation of database queries, we have more information besides the structure of queries. For instance, we know the number of tuples in relations, the selectivity of attributes and so on. In fact, all commercial query optimizers are based on quantitative methods and do not take structural properties into account. In this paper, in order to combine structural decomposition methods with quantitative approaches, the notion of weighted hypertree decomposition is defined. Weighted hypertree decompositions are equipped with cost functions that can be used for modeling many situations where there is further information on the given problem, besides its hypergraph representation. The complexity of computing hypertree decompositions having the smallest weights, called minimal hypertree decompositions, is analyzed. It is shown that in many cases tractability is lost if weights are added. However, it is proven that, under some (not very severe) restrictions on the allowed cost functions and on the target hypertrees, optimal weighted hypertree decompositions can be computed in polynomial time. For some easier hypertree weighting functions, this problem is also highly parallelizable. Then, a cost function modeling query evaluation costs is provided, and it is shown how to exploit weighted hypertree decompositions for determining (logical) query plans for answering conjunctive queries. Finally, some preliminary results of an experimental comparison of this query optimization technique with the query optimizer of a commercial DBMS are presented.

15.

An Optimal Cache for a Federated Database System

Alfredo Goñi, Arantza Illarramendi, Eduardo Mena, José Miguel Blanco 《Journal of Intelligent Information Systems》1997,9(2):125-155


16.

Optimizing large join queries using a graph-based approach (Total citations: 4; self-citations: 0; citations by others: 4)

Chiang Lee, Chi-Sheng Shih, Yaw-Huei Chen 《Knowledge and Data Engineering, IEEE Transactions on》2001,13(2):298-315

Although many query tree optimization strategies have been proposed in the literature, there still is a lack of a formal and complete representation of all possible permutations of query operations (i.e., execution plans) in a uniform manner. A graph-theoretic approach presented in the paper provides a sound mathematical basis for representing a query and searching for an execution plan. In this graph model, a node represents an operation and a directed edge between two nodes indicates the order of executing these two operations in an execution plan. Each node is associated with a weight and so is an edge. The weight is an expression containing the parameters required for optimization, such as relation size, tuple size, and join selectivity factors. All possible execution plans are representable in this graph and each spanning tree of the graph becomes an execution plan. It is a general model which can be used in the optimizer of a DBMS for internal query representation. On the basis of this model, we devise an algorithm that finds a near optimal execution plan using only polynomial time. The algorithm is compared with a few other popular optimization methods. Experiments show that the proposed algorithm is superior to the others under most circumstances.

17.

Algebraic identities and query optimization in a parametric model for relational temporal databases

Gadia S.K., Nair S.S. 《Knowledge and Data Engineering, IEEE Transactions on》1998,10(5):793-807

This paper presents algebraic identities and algebraic query optimization for a parametric model for temporal databases. The parametric model has several features not present in the classical model. In this model, a key is explicitly designated with a relation, and an operator is available to change the key. The algebra for the parametric model is three-sorted; it includes 1) relational expressions that evaluate to relations, 2) domain expressions that evaluate to time domains, and 3) Boolean expressions that evaluate to TRUE or FALSE. The identities in the parametric model are classified as weak identities and strong identities. Weak identities in this model are largely counterparts of the identities in classical relational databases. Rather than establishing weak identities from scratch, a meta inference mechanism, introduced in the paper, allows weak identities to be induced from their respective classical counterparts. The strong identities, on the other hand, are established from scratch. An algorithm is presented for algebraic optimization to transform a query to an equivalent query that will execute more efficiently.

18.

Parametrically Generating New Instances of Traditional Chinese Private Gardens that Replicate Selected Socio-Spatial and Aesthetic Properties

Rongrong Yu, Michael J. Ostwald, Ning Gu 《Nexus Network Journal》2015,17(3):807-829

This paper describes the use of a parametric system for generating garden plans that replicate selected socio-spatial characteristics and aesthetic properties of traditional Chinese private gardens (TCPGs). To achieve this, the spatial characteristics of three historic TCPGs are first mathematically derived using connectivity analysis, a variation of a space syntax technique. The data developed through this process is then used to shape the rules of a parametric system to generate new garden plans with similar spatial connectivity values and structures. While these new plans capture some of the socio-spatial features of the TCPG, the other important characteristic of these gardens is a particular level of visual complexity. Using fractal analysis, the characteristic visual complexity of the newly generated garden plans is then compared with the historic cases, to assess the success of the system in aesthetic terms. Through this three-stage process (syntactical derivation, parametric generation and fractal analysis) the paper demonstrates a method for capturing selected spatial and aesthetic properties in a parametric system and also provides new tools for landscape design in the context of specific historical sites and approaches.

19.

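A preprocessing algorithm for materialized view selection (Total citations: 5; self-citations: 1; citations by others: 4)

Zhang Baili (张柏礼), Sun Zhihui (孙志挥), Sun Xiang (孙翔) 《计算机研究与发展》 (Journal of Computer Research and Development) 2004,41(10):1645-1651

Existing static materialized view selection algorithms incur a high view search cost, which makes their time complexity too high for them to be used for online dynamic adjustment of materialized views. This paper proposes a preprocessing algorithm for materialized view selection, PMVS, which comprises the user query set dynamic adjustment algorithm QSDM, the candidate view lattice construction algorithm CVLC, and the candidate view filtering algorithm CVF. The algorithm can be used as a preprocessing step that reduces the number of views online, thereby lowering the view space search cost and time complexity of static algorithms. Theoretical analysis and experimental results show that the algorithm is effective and feasible.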

20.

Exploring optimization and caching for efficient collection operations

Venkata Krishna Suhas Nerella, Swetha Surapaneni, Sanjay K. Madria, Thomas Weigert 《Automated Software Engineering》2014,21(1):3-40

Many large programs operate on collection types. Extensive libraries are available in many programming languages, such as the C++ Standard Template Library, which make programming with collections convenient. Extending programming languages to provide collection queries as first-class constructs in the language would not only allow programmers to write queries explicitly in their programs but would also allow compilers to leverage the wealth of experience available from the database domain to optimize such queries. This paper describes an approach to reduce the run time of programs involving explicit collection queries by performing run-time query optimization that is effective for single runs of a program. In addition, it leverages a cache to store previously computed results. The proposed approach relies on histograms built from the data at run time to estimate the selectivity of joins and predicates in order to construct query plans. Information from earlier executions of the same query during run time is leveraged during the construction of the query plans, even when the data has changed between these executions. An effective cache policy is also determined for caching the results of join (sub)queries. The cache is maintained incrementally when the underlying collections change, and use of the cache space is optimized by a cache replacement policy. Our approach has been implemented within the Java Query Language (JQL) framework using AspectJ. Our evaluation demonstrates that run-time query optimization integrated with caching of subquery results significantly improves the run time of programs with explicit queries over equivalent programs performing collection operations by iterating over those collections. This paper evaluates our approach using synthetic as well as real-world Robocode programs by comparing it to JQL as a benchmark. Experimental results show that our approach performs better than the JQL approach with respect to program run time.
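Histogram-based selectivity estimation, which the abstract mentions, can be illustrated with a small sketch. It uses a plain equi-width histogram and a uniform-distribution assumption inside each bucket; the paper does not state which histogram type it uses, so this choice, the function names, and the restriction to a single `value < c` predicate are all assumptions made for illustration.

```python
def build_histogram(values, num_buckets, lo, hi):
    """Count values into num_buckets equal-width buckets over [lo, hi].

    Assumes every value lies within [lo, hi]; hi itself falls into the last bucket.
    """
    counts = [0] * num_buckets
    width = (hi - lo) / num_buckets
    for v in values:
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return counts


def estimate_selectivity_less_than(counts, lo, hi, c):
    """Estimate the fraction of values satisfying value < c.

    Buckets entirely below c count in full; the bucket containing c contributes
    proportionally, assuming values are spread uniformly inside it.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    width = (hi - lo) / len(counts)
    matched = 0.0
    for i, n in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        if c >= bucket_hi:
            matched += n
        elif c > bucket_lo:
            matched += n * (c - bucket_lo) / width
    return matched / total


if __name__ == "__main__":
    data = list(range(100))
    hist = build_histogram(data, num_buckets=10, lo=0, hi=100)
    print(estimate_selectivity_less_than(hist, 0, 100, 25))
```

An optimizer could use such estimates to order predicates or joins so that the most selective ones run first; the estimate is only as good as the uniformity assumption within each bucket.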