1.

Adaptive holistic scheduling for query processing in sensor networks

Hejun Wu Qiong Luo 《Journal of Parallel and Distributed Computing》2010

We observe two deficiencies of current query processing and scheduling techniques for sensor networks: (1) a query execution plan does not adapt to the hardware characteristics of sensing devices; and (2) the data communication schedule of each node is not adapted to the query runtime workload. Both cause time and energy waste in query processing in sensor networks. To address these problems, we propose an adaptive holistic scheduler, AHS, to run on each node in a wireless sensor network. AHS schedules both the query evaluation and the wireless communication operations, and is able to adapt the schedule to the runtime dynamics of these operations on each node. We have implemented AHS and tested it on real motes as well as in simulation. Our results show that AHS improves the performance of query processing in various dynamic settings.

2.

Adaptive workload allocation in query processing in autonomous heterogeneous environments

Anastasios Gounaris Jim Smith Norman W. Paton Rizos Sakellariou Alvaro A. A. Fernandes Paul Watson 《Distributed and Parallel Databases》2009,25(3):125-164

The increasing prevalence of networked storage and computational resources, along with middleware for managing resource access and sharing, raises the prospect that queries can be run over resources obtained on demand, rather than on dedicated infrastructures. However, the movement of query processing into non-dedicated environments means that it is necessary to take account of the partial information and unstable conditions that characterise autonomous, shared, distributed settings. Thus, query processing on grid platforms needs to be adaptive, revising evaluation strategies at query runtime in response to the evolving environment, such as changes to machine load and availability. To address this challenge, adaptive techniques are described that: (i) balance load across plan partitions supporting intra-operator parallelism; (ii) remove bottlenecks in pipelined plans supporting inter-operator parallelism; and (iii) combine the two aforementioned techniques. The approach has been empirically evaluated in a grid-enabled adaptive query processor.

3.

An intelligent query processing for distributed ontologies

Jihyun Lee Jun-Ki Min 《Journal of Systems and Software》2010,83(1):85-95

In this paper, we propose an intelligent distributed query processing method that takes into account the characteristics of a distributed ontology environment. Compared with previous work, we suggest more general models of the distributed ontology query and of the semantic mapping among distributed ontologies. Our approach rewrites a distributed ontology query into multiple distributed ontology queries using the semantic mapping, and we obtain the integrated answer by executing these queries. Furthermore, we propose a distributed ontology query processing algorithm with several query optimization techniques: pruning rules to remove unnecessary queries, a cost model considering site load balancing and caching, and a heuristic strategy for scheduling plans to be executed at a local site. Finally, experimental results show that our optimization techniques are effective in reducing the response time.

4.

Query optimization via contention space partitioning and cost error controlling for dynamic multidatabase systems

Qiang Zhu Jaidev Haridas Wen-Chi Hou 《Distributed and Parallel Databases》2008,23(2):151-188

A multidatabase system (MDBS) integrates information from multiple autonomous local databases. Performing global query optimization to achieve efficient query processing in such a system is challenging due to local autonomy of the data sources. Dynamic factors in the environment make the problem even more difficult. In this paper, we present two techniques, i.e., contention space partitioning and cost error controlling, to perform global query optimization in a dynamic MDBS. Both techniques generate an execution plan with multiple versions for a query in a dynamic MDBS, utilizing the multistate cost models built for the dynamic environment via our previous multistate query sampling method. The first technique partitions the contention space of a dynamic multidatabase environment into a given number of subspaces and chooses a good query execution plan version for each subspace, while the second technique selects a set of execution plan versions by using a given error tolerance to control query execution costs. Experiments demonstrate that the proposed techniques are quite promising for performing global query optimization in a dynamic MDBS. Compared with related work on dynamic query optimization, our approach has an advantage of avoiding the high overhead for modifying or re-generating an execution plan for a query based on dynamic runtime information. Research was supported by the US National Science Foundation under Grant # IIS-9811980 and The University of Michigan.
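
The idea of keeping one plan version per contention subspace can be illustrated with a minimal sketch. It assumes, purely for illustration, that contention is summarized as a single number and that the subspace boundaries and plan names are given; `choose_plan_version`, the boundaries, and the plan labels are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def choose_plan_version(boundaries, plan_versions, contention_level):
    """Return the plan version assigned to the subspace containing contention_level.

    boundaries: sorted upper bounds that split a one-dimensional contention
        range into len(boundaries) + 1 subspaces (an illustrative assumption).
    plan_versions: one precomputed plan version per subspace.
    """
    if len(plan_versions) != len(boundaries) + 1:
        raise ValueError("need exactly one plan version per subspace")
    # bisect_right finds the index of the subspace the level falls into.
    return plan_versions[bisect_right(boundaries, contention_level)]


# Hypothetical partition into three subspaces: low, medium, high contention.
boundaries = [0.3, 0.7]
plan_versions = ["plan_low", "plan_mid", "plan_high"]
```

Because the versions are chosen ahead of time, the runtime step is only a lookup, which matches the abstract's point about avoiding the overhead of modifying or re-generating a plan at runtime.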

5.

Disseminating streaming data in a dynamic environment: an adaptive and cost-based approach

Yongluan Zhou Beng Chin Ooi Kian-Lee Tan 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(6):1465-1483

In a distributed stream processing system, streaming data are continuously disseminated from the sources to the distributed processing servers. To enhance the dissemination efficiency, these servers are typically organized into one or more dissemination trees. In this paper, we focus on the problem of constructing dissemination trees to minimize the average loss of fidelity of the system. We observe that existing heuristic-based approaches can only explore a limited solution space and hence may lead to sub-optimal solutions. In contrast, we propose an adaptive and cost-based approach. Our cost model takes into account both the processing cost and the communication cost. Furthermore, as a distributed stream processing system is vulnerable to inaccurate statistics, runtime fluctuations of data characteristics, server workloads, and network conditions, we have designed our scheme to be adaptive to these situations: an operational dissemination tree may be incrementally transformed to a more cost-effective one. Our adaptive strategy employs distributed decisions made by the distributed servers independently based on localized statistics collected by each server at runtime. For a relatively static environment, we also propose two static tree construction algorithms relying on a priori system statistics. These static trees can also be used as initial trees in a dynamic environment. We apply our schemes to both single- and multi-object dissemination. Our extensive performance study shows that the adaptive mechanisms are effective in a dynamic context and the proposed static tree construction algorithms perform close to optimal in a static environment.

6.

Self-monitoring query execution for adaptive query processing

Anastasios Norman W. Alvaro A. A. Rizos 《Data & Knowledge Engineering》2004,51(3):325-348

Adaptive query processing generally involves a feedback loop comprising monitoring, assessment and response. So far, individual proposals have tended to group together an approach to monitoring, a means of assessment, and a form of response. However, there are many benefits in decoupling these three phases, and in constructing generic frameworks for each of them. To this end, this paper discusses monitoring of query plan execution as a topic in its own right, and advocates an approach based on self-monitoring algebraic operators. This approach is shown to be generic and independent of any specific adaptation mechanism, easily implementable and portable, sufficiently comprehensive, appropriate for heterogeneous distributed environments, and more importantly, capable of driving on-the-fly adaptations of query plan execution. An experimental evaluation of the overheads and of the quality of the results obtained by monitoring is also presented.

7.

ObjectGlobe: Ubiquitous query processing on the Internet

R. Braumandl M. Keidl A. Kemper D. Kossmann A. Kreutz S. Seltzsam K. Stocker 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(1):48-71

We present the design of ObjectGlobe, a distributed and open query processor for Internet data sources. Today, data is published on the Internet via Web servers which have, if at all, very localized query processing capabilities. The goal of the ObjectGlobe project is to establish an open marketplace in which data and query processing capabilities can be distributed and used by any kind of Internet application. Furthermore, ObjectGlobe integrates cycle providers (i.e., machines) which carry out query processing operators. The overall picture is to make it possible to execute a query with – in principle – unrelated query operators, cycle providers, and data sources. Such an infrastructure can serve as enabling technology for scalable e-commerce applications, e.g., B2B and B2C market places, to be able to integrate data and data processing operations of a large number of participants. One of the main challenges in the design of such an open system is to ensure privacy and security. We discuss the ObjectGlobe security requirements, show how basic components such as the optimizer and runtime system need to be extended, and present the results of performance experiments that assess the additional cost for secure distributed query processing. Another challenge is quality of service management so that users can constrain the costs and running times of their queries. Received: 30 October 2000 / Accepted: 14 March 2001 Published online: 7 June 2001

8.

Querying a messy web of data with Avalanche

《Journal of Web Semantics》2014

Recent efforts have enabled applications to query the entire Semantic Web. Such approaches are either based on a centralised store or link traversal and URI dereferencing as often used in the case of Linked Open Data. These approaches make additional assumptions about the structure and/or location of data on the Web and are likely to limit the diversity of resulting usages. In this article we propose a technique called Avalanche, designed for querying the Semantic Web without making any prior assumptions about the data location or distribution, schema-alignment, pertinent statistics, data evolution, and accessibility of servers. Specifically, Avalanche finds up-to-date answers to queries over SPARQL endpoints. It first gets on-line statistical information about potential data sources and their data distribution. Then, it plans and executes the query in a concurrent and distributed manner trying to quickly provide first answers. We empirically evaluate Avalanche using the realistic FedBench data-set over 26 servers and investigate its behaviour for varying degrees of instance-level distribution “messiness” using the LUBM synthetic data-set spread over 100 servers. Results show that Avalanche is robust and stable in spite of varying network latency, finding first results for 80% of the queries in under 1 s. It also exhibits stability for some classes of queries when instance-level distribution messiness increases. We also illustrate how Avalanche addresses the other sources of messiness (pertinent data statistics, data evolution and data presence) by design and show its robustness by removing endpoints during query execution.

9.

Optimizing large join queries using a graph-based approach (total citations: 4; self-citations: 0; citations by others: 4)

Chiang Lee Chi-Sheng Shih Yaw-Huei Chen 《Knowledge and Data Engineering, IEEE Transactions on》2001,13(2):298-315

Although many query tree optimization strategies have been proposed in the literature, there is still no formal and complete representation of all possible permutations of query operations (i.e., execution plans) in a uniform manner. A graph-theoretic approach presented in the paper provides a sound mathematical basis for representing a query and searching for an execution plan. In this graph model, a node represents an operation and a directed edge between two nodes indicates the order of executing these two operations in an execution plan. Each node is associated with a weight and so is each edge. The weight is an expression containing the parameters required for optimization, such as relation size, tuple size, and join selectivity factors. All possible execution plans are representable in this graph and each spanning tree of the graph becomes an execution plan. It is a general model which can be used in the optimizer of a DBMS for internal query representation. On the basis of this model, we devise an algorithm that finds a near-optimal execution plan using only polynomial time. The algorithm is compared with a few other popular optimization methods. Experiments show that the proposed algorithm is superior to the others under most circumstances.

10.

Semi-static operator graphs for accelerated query execution on FPGAs

《Microprocessors and Microsystems》2017

This paper introduces the concept of Semi-static Operator Graphs (SOG) to provide a runtime reconfigurable accelerator for query execution based on a Field Programmable Gate Array (FPGA). Instead of generating an FPGA configuration for a given arbitrary query during system runtime, we deploy a general query structure on the FPGA consisting of multiple small reconfigurable partitions (RP). During deployment of the hybrid database system, for each RP various query operators are prepared as reconfigurable modules (RM). At system runtime, the proposed approach dynamically chooses and reconfigures RMs into the RPs according to a given query. As a result, the reconfiguration overhead during system runtime is significantly reduced, which enables the use of our hybrid architecture in real-world scenarios.

11.

A cross-layer optimized storage system for workflow applications

《Future Generation Computer Systems》2017

This paper proposes using file system custom metadata as a bidirectional communication channel between applications and the storage middleware. This channel can be used to pass hints that enable cross-layer optimizations, an option hindered today by the ossified file-system interface. We study this approach in the context of storage system support for large-scale workflow execution systems: our workflow-optimized storage system (WOSS) exploits application hints to provide per-file optimized operations, and exposes data location to enable location-aware scheduling. We argue that an incremental path for adopting cross-layer optimizations in storage exists, present the system architecture for a workflow-optimized storage system and its integration with a workflow runtime engine, and evaluate this approach using synthetic and real applications over multiple success metrics (application runtime, generated network stress, and energy). Our performance evaluation demonstrates that this design brings sizeable performance gains. On a large-scale cluster (100 nodes), compared to two production-class distributed storage systems (Ceph and GlusterFS), WOSS achieves up to 6× better performance for the synthetic benchmarks and 20–40% better application-level performance for real applications.

12.

Designing function blocks for distributed process planning and adaptive control (total citations: 1; self-citations: 0; citations by others: 1)

Lihui Wang Yijun Song Qiaoying Gao 《Engineering Applications of Artificial Intelligence》2009,22(7):1127-1138

The objective of this research is to develop methodologies and a framework for distributed process planning and adaptive control using function blocks. Facilitated by a real-time monitoring system, the proposed methodologies can be applied to integrate with functions of dynamic scheduling in a distributed environment. A function block-enabled process planning approach is proposed to handle dynamic changes during process plan generation and execution. This paper focuses mainly on distributed process planning, particularly on the development of a function block designer that can encapsulate generic process plans into function blocks for runtime execution. As function blocks can sense environmental changes on a shop floor, it is expected that a so-generated process plan can adapt itself to the shop floor environment with dynamically optimized solutions for plan execution and process monitoring.

13.

A Probe-Based Technique to Optimize Join Queries in Distributed Internet Databases (total citations: 1; self-citations: 0; citations by others: 1)

Cyrus Shahabi Latifur Khan Dennis McLeod 《Knowledge and Information Systems》2000,2(3):373-385

An adaptive probe-based optimization technique is developed and demonstrated in the context of an Internet-based distributed database environment. More and more common are database systems which are distributed across servers communicating via the Internet where a query at a given site might require data from remote sites. Optimizing the response time of such queries is a challenging task due to the unpredictability of server performance and network traffic at the time of data shipment; this may result in the selection of an expensive query plan using a static query optimizer. We constructed an experimental setup consisting of two servers running the same database management system connected via the Internet. Concentrating on join queries, we demonstrate how a static query optimizer might choose an expensive plan by mistake. This is due to the lack of a priori knowledge of the run-time environment, inaccurate statistical assumptions in size estimation, and neglecting the cost of remote method invocation. These shortcomings are addressed collectively by proposing a probing mechanism. An implementation of our run-time optimization technique for join queries was constructed in the Java language and incorporated into an experimental setup. The results demonstrate the superiority of our probe-based optimization over a static optimization. Received 6 February 1999 / Revised 15 February 2000 / Accepted 10 May 2000

14.

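Selecting query execution plans under concurrent queries

Pei Zefeng (裴泽锋) Niu Baoning (牛保宁) Zhang Jinwen (张锦文) Amjad Muhammad 《Journal of Computer Applications (计算机应用)》2020,40(2):420-425

Queries are the main workload of a database system, and their efficiency determines how well the database performs. A query has multiple possible execution plans; at present, a query optimizer can only select a relatively good plan for a query statically, according to the configuration parameters of the database system. Concurrent queries contend for resources in complex and changing ways that configuration parameters can hardly reflect accurately, and the same execution plan does not perform equally well in different scenarios. Plan selection under concurrent queries therefore needs to consider the mutual influence among queries, that is, query interaction. On this basis, a metric named QIs is proposed to measure how strongly a query is affected by query interaction under concurrent queries. For plan selection under concurrent queries, a method named TRating is also proposed that dynamically selects an execution plan for a query: it compares how strongly the query is affected by query interaction when it is executed under different plans within a query mix, and selects the less affected plan as the better execution plan for that query. Experimental results show that TRating selects the better execution plan with an accuracy of 61%, an improvement of 25% over the query optimizer; when selecting the second-best execution plan, its accuracy reaches 69%.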

15.

Alternative Strategies for Performing Spatial Joins on Web Sources

Cyrus Shahabi Mohammad R. Kolahdouzan Maytham Safar 《Knowledge and Information Systems》2004,6(3):290-314

With the current information explosion on the Web, numerous applications require access to a collection of different but related pieces of distributed geospatial data. In this paper, we focus on one set of such applications that requires efficient support of spatial operations (specifically, spatial join) on distributed non-database sources. The main challenge with this environment is that remote sources are usually read-only and/or do not support spatial queries. Moreover, several of these Web-based applications can tolerate either some level of inaccuracy or progressively filtered (or polished) results. Therefore, conventional distributed spatial join strategies are not applicable or efficient in this environment. To address these challenges, we first break down the process of distributed spatial join operation into three steps: (1) local to remote transfer, (2) remote spatial selection, and (3) local refinement. Then, for each step, we propose and study alternative techniques and by varying their combinations, we generate several query plans. Each plan strives to strike a compromise between efficiency and accuracy. Since the techniques proposed for the first step have significant impact on the overall performance of the query, we specially focus our attention on this step. We propose two heuristics for the first step to reduce either the number of selection queries or the area covered by each selection query. Within a realistic experimental set-up, we show that one heuristic is more appropriate with fast networks and a powerful local server, while the other one is superior in the opposite situation. Our experiments also show that both heuristics outperform approaches based on transmitting either the actual spatial objects or their bounding boxes. Note that the intention of this paper is not to propose a query optimizer to choose one plan over the others. 
Instead, it serves as a first step towards the design of such an optimizer by concentrating on the design and evaluation of several alternative plans within a realistic experimental set-up.
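
The three-step decomposition described in this abstract (local to remote transfer, remote spatial selection, local refinement) can be sketched as a filter-and-refine join. The sketch below is only an illustration under simplifying assumptions that are not from the paper: local objects are circles, the remote source holds points and answers only rectangular window queries, and `remote_window_query` stands in for a remote call. All names are hypothetical.

```python
from math import hypot


def remote_window_query(remote_points, box):
    """Stand-in for the remote source: return the points inside a rectangle."""
    xmin, ymin, xmax, ymax = box
    return [(x, y) for (x, y) in remote_points
            if xmin <= x <= xmax and ymin <= y <= ymax]


def spatial_join(local_circles, remote_points):
    """Join local circles with remote points that fall inside them."""
    results = []
    for cx, cy, r in local_circles:
        # Step 1: local to remote transfer. Send only the bounding box.
        box = (cx - r, cy - r, cx + r, cy + r)
        # Step 2: remote spatial selection using the transferred box.
        candidates = remote_window_query(remote_points, box)
        # Step 3: local refinement with the exact geometric predicate.
        for px, py in candidates:
            if hypot(px - cx, py - cy) <= r:
                results.append(((cx, cy, r), (px, py)))
    return results
```

In this sketch each local object triggers one selection query. The abstract's two heuristics trade off the number of such queries against the area each query covers; neither heuristic is implemented here.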

16.

Consistent selectivity estimation via maximum entropy

V. Markl P. J. Haas M. Kutsch N. Megiddo U. Srivastava T. M. Tran 《The VLDB Journal The International Journal on Very Large Data Bases》2007,16(1):55-76

Cost-based query optimizers need to estimate the selectivity of conjunctive predicates when comparing alternative query execution plans. To this end, advanced optimizers use multivariate statistics to improve information about the joint distribution of attribute values in a table. The joint distribution for all columns is almost always too large to store completely, and the resulting use of partial distribution information raises the possibility that multiple, non-equivalent selectivity estimates may be available for a given predicate. Current optimizers use cumbersome ad hoc methods to ensure that selectivities are estimated in a consistent manner. These methods ignore valuable information and tend to bias the optimizer toward query plans for which the least information is available, often yielding poor results. In this paper we present a novel method for consistent selectivity estimation based on the principle of maximum entropy (ME). Our method exploits all available information and avoids the bias problem. In the absence of detailed knowledge, the ME approach reduces to standard uniformity and independence assumptions. Experiments with our prototype implementation in DB2 UDB show that use of the ME approach can improve the optimizer’s cardinality estimates by orders of magnitude, resulting in better plan quality and significantly reduced query execution times. For almost all queries, these improvements are obtained while adding only tens of milliseconds to the overall time required for query optimization.
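
A small sketch may help make the maximum entropy idea concrete. It is not the authors' implementation: it uses plain iterative scaling over all 2^n truth assignments of n predicates, which is only practical for a handful of predicates, and it assumes every known selectivity lies strictly between 0 and 1 and that the known values are mutually consistent. The function name and parameters are hypothetical.

```python
from itertools import product


def max_entropy_selectivity(n, known, target, iterations=200):
    """Estimate the selectivity of a conjunction under maximum entropy.

    n: number of predicates, indexed 0..n-1.
    known: dict mapping a frozenset of predicate indexes to the known
        selectivity of their conjunction (each strictly between 0 and 1).
    target: set of predicate indexes whose conjunction is to be estimated.
    """
    atoms = list(product([False, True], repeat=n))
    # Start from the uniform distribution, the maximum entropy starting point.
    prob = {atom: 1.0 / len(atoms) for atom in atoms}

    def satisfies(atom, preds):
        return all(atom[i] for i in preds)

    for _ in range(iterations):
        for preds, sel in known.items():
            current = sum(p for a, p in prob.items() if satisfies(a, preds))
            for a in atoms:
                # Rescale both sides so this constraint holds exactly.
                if satisfies(a, preds):
                    prob[a] *= sel / current
                else:
                    prob[a] *= (1.0 - sel) / (1.0 - current)
    return sum(p for a, p in prob.items() if satisfies(a, target))
```

With only the single-predicate selectivities 0.1 and 0.2 known, the estimate for their conjunction comes out as the product 0.02, which illustrates the abstract's remark that the ME approach reduces to the independence assumption when no joint information is available.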

17.

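A query optimization method for parallel databases based on multi-weighted trees (total citations: 1; self-citations: 0; citations by others: 1)

Li Jianzhong (李建中) 《Chinese Journal of Computers (计算机学报)》1998,21(5):401-412

This paper proposes a query optimization method based on multi-weighted trees, comprising a multi-weighted-tree model of parallel query plans, a complexity model of parallel query plans, and a query optimization procedure.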

18.

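Research on object-oriented image processing

Wang Lizhen (王丽珍) Wan Xianzhong (完献忠) 《Computer Engineering & Science (计算机工程与科学)》1997,19(3):18-26

To address some of the problems in current microcomputer image processing, this paper proposes an object-oriented image processing method, which offers a useful reference for solving those problems.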

19.

Efficiently adapting graphical models for selectivity estimation

Kostas Tzoumas Amol Deshpande Christian S. Jensen 《The VLDB Journal The International Journal on Very Large Data Bases》2013,22(1):3-27

Query optimizers rely on statistical models that succinctly describe the underlying data. Models are used to derive cardinality estimates for intermediate relations, which in turn guide the optimizer to choose the best query execution plan. The quality of the resulting plan is highly dependent on the accuracy of the statistical model that represents the data. It is well known that small errors in the model estimates propagate exponentially through joins, and may result in the choice of a highly sub-optimal query execution plan. Most commercial query optimizers make the attribute value independence assumption: all attributes are assumed to be statistically independent. This reduces the statistical model of the data to a collection of one-dimensional synopses (typically in the form of histograms), and it permits the optimizer to estimate the selectivity of a predicate conjunction as the product of the selectivities of the constituent predicates. However, this independence assumption is more often than not wrong, and is considered to be the most common cause of sub-optimal query execution plans chosen by modern query optimizers. We take a step towards a principled and practical approach to performing cardinality estimation without making the independence assumption. By carefully using concepts from the field of graphical models, we are able to factor the joint probability distribution over all the attributes in the database into small, usually two-dimensional distributions, without a significant loss in estimation accuracy. We show how to efficiently construct such a graphical model from the database using only two-way join queries, and we show how to perform selectivity estimation in a highly efficient manner. We integrate our algorithms into the PostgreSQL DBMS. Experimental results indicate that estimation errors can be greatly reduced, leading to orders of magnitude more efficient query execution plans in many cases. 
Optimization time is kept in the range of tens of milliseconds, making this a practical approach for industrial-strength query optimizers.
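
The attribute value independence assumption mentioned in this abstract can be illustrated with a toy example. The table, column values, and function names below are made up for illustration; the sketch only shows how multiplying per-predicate selectivities can misestimate a conjunction over correlated columns, and it does not implement the graphical-model technique of the paper.

```python
def independence_estimate(rows, predicates):
    """Estimate conjunction selectivity as the product of single selectivities."""
    estimate = 1.0
    for pred in predicates:
        estimate *= sum(1 for row in rows if pred(row)) / len(rows)
    return estimate


def true_selectivity(rows, predicates):
    """Fraction of rows that satisfy every predicate."""
    return sum(1 for row in rows if all(p(row) for p in predicates)) / len(rows)


# Hypothetical table with perfectly correlated columns (make, model).
rows = [("Honda", "Accord")] * 50 + [("Toyota", "Camry")] * 50
predicates = [lambda row: row[0] == "Honda", lambda row: row[1] == "Accord"]
```

On this made-up table the independence estimate is 0.25 while the true selectivity is 0.5, because the two columns are fully correlated. Errors of this kind are what the abstract says can compound across joins.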

20.

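Research on heuristic optimization of query trees

Zhang Yan (张岩) 《Digital Community & Smart Home (数字社区&智能家居)》2007,3(7):126-127

This paper describes a heuristic optimization method for query trees, optimizes the query tree given in reference [1], and analyzes the execution cost of the query tree.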