Similar literature
 Found 20 similar documents; search took 406 ms
1.
Performance studies show that traditional semi-join processing methods are sometimes inefficient because of their storage and processing overhead. To remedy this problem, a new semi-join processing method, called one-shot semi-join execution, is proposed. This method allows parallel generation of all the semi-join projections, parallel transmission of all the semi-join projections, and parallel execution of all the semi-joins. The authors apply this method to optimize the response time for processing distributed queries. A response time model is established that considers both data transmission time and local processing time. Based on this model, an efficient query processing algorithm is developed and analyzed.
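
As a minimal illustration of the semi-join reducers this abstract builds on (not the authors' one-shot algorithm or their response time model), the sketch below generates every join-column projection up front and then applies each semi-join independently. The relations and the helper names `project` and `semi_join` are hypothetical.

```python
def project(relation, attr):
    """Return the set of distinct values of `attr` (the semi-join projection)."""
    return {row[attr] for row in relation}


def semi_join(relation, attr, projection):
    """Keep only the rows of `relation` whose `attr` value appears in `projection`."""
    return [row for row in relation if row[attr] in projection]


# Hypothetical relations, imagined to live at two different sites.
orders = [
    {"order_id": 1, "cust_id": 10},
    {"order_id": 2, "cust_id": 20},
    {"order_id": 3, "cust_id": 30},
]
customers = [
    {"cust_id": 10, "name": "Ada"},
    {"cust_id": 30, "name": "Lin"},
    {"cust_id": 40, "name": "Sam"},
]

# All projections are computed from the original relations, so the two
# semi-joins below do not depend on each other and could run in parallel.
proj_orders = project(orders, "cust_id")
proj_customers = project(customers, "cust_id")

reduced_orders = semi_join(orders, "cust_id", proj_customers)
reduced_customers = semi_join(customers, "cust_id", proj_orders)
```

Only the projections (small sets of join values) would cross the network here; the full rows stay at their sites until the final join.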

2.
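须德 (Xu De), 张彤 (Zhang Tong). 《软件学报》 (Journal of Software), 1993, 4(2): 58-64
This paper extends the semi-join operation and proposes a new method for solving cyclic queries, called the flag-bit mapping method. The method fully reduces all relations in a cyclic query at a cost of 5n-4 data transmissions between adjacent nodes, where n is the number of nodes in the query graph.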

3.
Indexing is one of the most important techniques to facilitate query processing over a multi-dimensional dataset. A commonly used strategy for such indexing is to keep the tree-structured index balanced. This strategy reduces query processing cost in the worst case and handles all different queries equally well. In other words, this strategy implicitly assumes that all queries are uniformly issued, partly because the query distribution cannot be known in advance and changes over time in practice. A key issue we study in this work is whether it is best to rely fully on a balanced tree-structured index, in particular as datasets become larger and larger in the big data era. When a dataset becomes very large, it becomes unreasonable to assume that all data in any subspace are equally important and are uniformly accessed by all queries at the index level. Given the existence of query skew and possible changes in that skew, in this paper we study how to handle query skew and its changes at the index level without sacrificing support for any possible query in a well-balanced tree index and without high overhead. To tackle the issue, we propose the index-view, a short-cut in a balanced tree-structured index for accessing objects in the subspaces that are more frequently accessed, and a new index-view-centric framework for query processing that uses index-views in a bottom-up manner. We study the index-view selection problem in both static and dynamic settings, and we confirm the effectiveness of our approach using large real and synthetic datasets.

4.
Interest in multimedia database management systems has grown rapidly due to the need to store huge volumes of multimedia data in computer systems. An important building block of a multimedia database system is the query processor, and a query optimizer embedded in the query processor is needed to answer user queries efficiently. The query optimization problem has been widely studied for conventional database systems; however, it is a new research area for multimedia database systems. Due to the differences in query processing strategies, query optimization techniques used in multimedia database systems are different from those used in traditional databases. In this paper, a query optimization strategy is proposed for processing spatio-temporal queries in video database systems. The proposed strategy includes reordering algorithms to be applied to the query execution tree. The performance results obtained by testing the reordering algorithms on different query sets are also presented.

5.
We investigate the problem of processing historical queries on a sensor network. Since data is considered to have been already collected at the sensor nodes, the main issue is exploring the spatial component of the query in order to minimize its cost, represented by the energy consumption. We assume queries can be issued at any network node, i.e., there is no central base station and all nodes have only local knowledge of the network. On the one hand, a globally optimal query processing plan is desirable, but its construction is not possible due to the lack of global knowledge of the network. On the other hand, while simple network flooding is feasible, it is not a practical choice from a cost perspective. To address this problem we propose a two-phase query processing strategy, where in the first phase a path from the query originator to the query region is found and in the second phase the query is processed within the query region itself. This strategy is supported by analytical models that are used to dynamically select the best processing strategy depending on the query specifics. Our extensive analytical and experimental results show that our analytical models are accurate and that the two-phase strategy is better suited for small to medium-sized queries, being up to 10 times more cost-effective than typical network flooding. In addition, the dynamic selection of a query processing technique proved capable of always delivering performance at least as good as the most energy-efficient strategy for all query sizes. Research supported in part by NSERC Canada.

6.
Information Systems, 1999, 24(7): 597-612
Query rewriting using views is a technique for determining how a query may be answered using a given set of resources, which may include materialized views, cached results of previous queries, or queries answerable by other databases. The power of query rewriting can be considerably enhanced by taking into account integrity constraints that are known to hold on base relations. This paper describes an extension of query rewriting that utilizes inclusion dependencies to find rewritings of queries that would otherwise be overlooked. We describe a complete strategy for finding rewritings in the presence of inclusion dependencies and present a basic algorithm that implements that strategy. We also describe extensions to this algorithm when both inclusion and functional dependencies are considered.

7.
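张凡 (Zhang Fan), 熊志平 (Xiong Zhiping), 胡运发 (Hu Yunfa). 《计算机工程》 (Computer Engineering), 2006, 32(10): 66-67, 70
Tree patterns are the natural model for querying tree-structured data such as XML and LDAP. When a query is run against a given database, its efficiency depends largely on the size of the query. It is therefore important to minimize the query by removing its redundant branches before evaluation. Three kinds of integrity constraints are very common in tree-structured databases: required-child, required-descendant, and subtype. For the case where these three kinds of constraints are present, the paper proposes, based on an extended notion of simulation, an algorithm of complexity O(n^2) for minimizing tree pattern queries (n is the number of nodes in the tree pattern query). Analysis shows that this algorithm is far more efficient than comparable algorithms.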

8.
As RDF data continue to gain popularity, we witness fast growth in both the number of RDF repositories and the size of RDF datasets. Many known RDF datasets contain billions of RDF triples (subject, predicate and object). One of the grand challenges in managing these huge RDF datasets is how to execute RDF queries efficiently. In this paper, we address the query processing problems posed by billion-triple datasets. We first identify some causes of the problems of existing query optimization schemes, such as large intermediate results and initial query cost estimation errors. Then, we present our block-oriented dynamic query plan generation approach powered by pipelined execution. Our approach consists of two phases. In the first phase, a near-optimal execution plan for queries is chosen by identifying the processing blocks of queries. We group the join patterns sharing a join variable into building blocks of the query plan, since executing them first provides opportunities to reduce the size of the intermediate results generated. In the second phase, we further optimize the initial pipelining for a given query plan. We employ optimization techniques, such as sideways information passing and semi-join, to further reduce the size of intermediate results, improve the query processing cost estimation, and speed up query execution. Experimental results on several RDF datasets of over a billion triples demonstrate that our approach outperforms existing RDF query engines that rely on dynamic-programming-based static query processing strategies.

9.
A top-k query in a wireless sensor network finds the k sensor nodes with the highest sensing values. Evaluating the top-k query in such an energy-constrained network poses great challenges, due to the unique characteristics imposed on its sensors. Existing solutions for top-k queries in the literature mainly focus on energy efficiency, but little attention has been paid to the query response time and its effect on the network lifetime. In this paper we address the query response time and its effect on the network lifetime through the study of the top-k query problem in sensor networks with a response time constraint. We aim at finding an energy-efficient routing tree and evaluating top-k queries on the tree such that the network lifetime is significantly prolonged, provided that the query response time constraint is also met. To do so, we first present a cost model of energy consumption for answering top-k queries and introduce the query response time definition. We then propose a novel joint query optimization framework, which consists of finding a routing tree in the network and devising a filter-based evaluation algorithm for top-k query evaluation on the tree. We finally conduct extensive simulation experiments to evaluate the performance of the proposed algorithms in terms of the total energy consumption, the maximum energy consumption among nodes, the query response time, and the network lifetime. The experimental results show that there is a non-trivial tradeoff between the query response time and the network lifetime, and that the joint query optimization framework can prolong the network lifetime significantly under a specified query response time constraint.
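
To make the idea of evaluating a top-k query on a routing tree concrete, here is a minimal sketch of plain in-network aggregation, where each node forwards at most k candidates to its parent. It is not the filter-based algorithm or the routing tree construction from the abstract; the tree, the readings, and the function name `top_k_on_tree` are hypothetical.

```python
import heapq


def top_k_on_tree(children, readings, node, k):
    """Return the k largest (value, node_id) pairs in the subtree rooted at `node`.

    Each node merges its own reading with the partial results of its children
    and passes at most k pairs up to its parent.
    """
    candidates = [(readings[node], node)]
    for child in children.get(node, []):
        candidates.extend(top_k_on_tree(children, readings, child, k))
    return heapq.nlargest(k, candidates)


# Hypothetical routing tree rooted at the sink "s", with hypothetical readings.
children = {"s": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
readings = {"s": 3, "a": 17, "b": 8, "c": 25, "d": 4, "e": 21}

result = top_k_on_tree(children, readings, "s", k=2)
print(result)  # [(25, 'c'), (21, 'e')]
```

Because every link carries at most k pairs, the message size per hop is bounded by k rather than by the subtree size, which is the kind of saving an energy cost model would account for.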

10.
Zheng Shihui, Zhou Aoying, Zhang Long, Lu Hongjun. World Wide Web, 2003, 6(2): 233-253
XML has been recognized as a promising language for data exchange over the Internet. A number of query languages have been proposed for querying XML data. Most of those languages are path-expression based. One difficulty in forming path-expression based queries is that users have to know the structure of the XML data against which the queries are issued. In this paper, we describe a DTD-driven visual query interface for XML database systems. With such an interface, a user can easily form path-expression based queries by clicking elements in the DTD tree displayed on the screen and supplying conditions if necessary. The interface and the query generation process are described in detail.

11.
This study proposes a method of in-network aggregate query processing to reduce the number of messages incurred in a wireless sensor network. When aggregate queries are issued to a resource-constrained wireless sensor network, it is important to perform these queries efficiently. Given a set of multiple aggregate queries, the proposed approach shares intermediate results among queries to reduce the number of messages. When the sink receives multiple queries, it propagates them to the wireless sensor network via existing routing protocols. The sink can obtain the corresponding topology of the queries and view each query as a query tree. With a set of query trees collected at the sink, it is necessary to determine a set of backbones that share intermediate results with other query trees (called non-backbones). First, it is necessary to formulate the objective cost function for backbones and non-backbones. Using this objective cost function, it is possible to derive a reduction graph that reveals possible cases of sharing intermediate results among query trees. Using the reduction graph, this study first proposes a heuristic algorithm, BM (Backbone Mapping). This study also develops the algorithm OOB (Obtaining Optimal Backbones), which exploits a branch-and-bound strategy to obtain the optimal solution efficiently. This study tests the performance of these algorithms on both synthetic and real datasets. Experimental results show that by sharing the intermediate results, the BM and OOB algorithms significantly reduce the total number of messages incurred by multiple aggregate queries, thereby extending the lifetime of sensor networks.

12.
The application of a combination of join and semi-join operations to minimize the amount of data transmission required for distributed query processing is discussed. Specifically, two important concepts that arise when join operations are used as reducers in query processing, namely gainful semi-joins and pure join attributes, are used. Some semi-joins, though not profitable themselves, may benefit the execution of subsequent join operations as reducers. Such a semi-join is termed a gainful semi-join. In addition, join attributes that are not part of the output attributes are referred to as pure join attributes. The authors exploit the usefulness of gainful semi-joins and use the removability of pure join attributes to reduce the amount of data transmission required for query processing. Heuristic searches are developed to determine a sequence of join and semi-join reducers for query processing. Results indicate the importance of combining joins and semi-joins for distributed query processing.

13.
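With the rapid development of grid computing technology, its application areas keep expanding. However, joint queries over data distributed across networks often become the performance bottleneck. Starting from the goal of improving the efficiency of joint queries over distributed data, and taking grid computing, a new kind of web architecture, as the technical platform, this paper studies the objectives, objects, and strategies of query optimization based on grid services. It adopts the widely used heuristic search algorithm as the optimization strategy for narrowing the strategy space, and on that basis proposes a grid-service-based 2-way semi-join query optimization algorithm. Finally, the algorithm is compared experimentally with an ordinary full-join algorithm in a grid computing environment, showing that the grid-service-based 2-way semi-join query optimization algorithm can greatly shorten query response time.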

14.
One of the challenges of resource discovery in unstructured peer-to-peer grid systems is minimizing network traffic. The network traffic arises from query messages that are broadcast to other peers in order to find the appropriate resources. Blind search methods that are employed in such systems do not work well because every specific query generates high query traffic, which quickly overwhelms the network. Informed search methods usually use the recorded history of previous queries to decide where new queries should be sent. Such methods can reduce network traffic but do not consider the path length. In this study, a method is proposed in which both the path length and network traffic are considered. This approach reduces the number of hops and prevents massive flooding of query messages. To do this, it selects the optimum neighbor peer(s) in order to optimize query forwarding. The proposed approach uses statistical tables that are obtained from the recorded history of previous queries. Then a genetic algorithm is applied to these statistical tables to find the optimum neighbor peer(s). The proposed approach showed that query forwarding through the optimum neighbor peer(s) has a greater probability of finding a requested resource with fewer hops. This method was compared with random walk and flooding approaches. It was observed that the network traffic decreased remarkably in comparison to a flooding approach, whereas it was similar to that of a random walk method. Moreover, this method provided a higher success rate than the random walk approach, whereas its success rate was similar to that of a flooding method.

15.
This paper proposes a two-level P2P caching strategy for Web search queries. The design is suitable for a fully distributed service platform based on managed peer boxes (set-top box or DSL/cable modem) located at the edge of the network, where both the boxes and the access bandwidth to those boxes are controlled and managed by an ISP. Our solution significantly reduces the user query traffic going outside of the ISP to get query results from the respective Web search engine. Web users are usually very reactive to worldwide events, which cause highly dynamic query traffic patterns leading to load imbalance across peers. Our solution contains a strategy to quickly ease imbalance on peers and spread communication flow among participating peers. Each peer maintains a local result cache used to keep the answers for queries originated in the peer itself and for queries for which the peer is responsible, which it answers by contacting the Web search engine on demand. When query traffic is predominantly routed to a few responsible peers, our strategy replicates the role of “being responsible for” to neighboring peers so that they can absorb query traffic. This is a fairly slow and adaptive process that we call mid-term load balancing. To achieve a short-term fair distribution of queries, we introduce a location cache in each peer, which keeps pointers to peers that have already requested the same queries in the recent past. This lets these peers share their query answers with newly requesting peers. This process is fast, as these popular queries are usually cached in the first DHT hop of a requesting peer, which quickly tends to redistribute load among more and more peers.

16.
Batch Nearest Neighbor Search for Video Retrieval   Cited by: 2 (self-citations: 0, citations by others: 2)
To retrieve videos similar to a query clip from a large database, each video is often represented by a sequence of high-dimensional feature vectors. Typically, given a query video containing m feature vectors, an independent nearest neighbor (NN) search for each feature vector is first performed. After completing all the NN searches, an overall similarity is then computed, i.e., a single content-based video retrieval usually involves m individual NN searches. Since nearby feature vectors in a video are normally similar, a large number of expensive random disk accesses are expected to occur repeatedly, which crucially affects the overall query performance. Batch nearest neighbor (BNN) search is defined as a batch operation that performs a number of individual NN searches. This paper presents a novel approach to efficient high-dimensional BNN search, called dynamic query ordering (DQO), for advanced optimization of both I/O and CPU costs. Observing that the overlapped candidates (or search space) of a previous query may help to further reduce the candidate sets of subsequent queries, DQO aims at progressively finding a query order such that the common candidates among queries are fully utilized to maximally reduce the total number of candidates. Modelling the candidate set relationship of queries by a candidate overlapping graph (COG), DQO iteratively selects the next query to be executed based on its estimated pruning power over the remaining queries, using the dynamically updated COG. Extensive experiments are conducted on real video datasets and show the significance of our BNN query processing strategy.
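
The baseline that the abstract describes, m independent NN searches followed by an overall similarity, can be sketched as below. This is only the naive starting point, not DQO or the candidate overlapping graph. The linear scan, the vote count used as the "overall similarity", and all names and data are assumptions for illustration.

```python
import math


def nearest_neighbor(query_vec, database):
    """Linear-scan NN: return (distance, video_id) of the closest database vector."""
    return min((math.dist(query_vec, vec), vid) for vid, vec in database)


def batch_nn(query_vectors, database):
    """Run one independent NN search per query vector and count hits per video."""
    votes = {}
    for q in query_vectors:
        _, vid = nearest_neighbor(q, database)
        votes[vid] = votes.get(vid, 0) + 1
    return votes


# Hypothetical 2-D feature vectors, each tagged with the video it belongs to.
database = [
    ("video_a", (0.0, 0.0)),
    ("video_a", (0.1, 0.1)),
    ("video_b", (5.0, 5.0)),
]
# A hypothetical query clip with m = 3 feature vectors.
query = [(0.0, 0.1), (0.2, 0.1), (4.8, 5.1)]

votes = batch_nn(query, database)
print(votes)  # {'video_a': 2, 'video_b': 1}
```

Each call to `nearest_neighbor` rescans the whole database even though consecutive query vectors are close to each other, which is the repeated work a batch-aware ordering of the queries would try to avoid.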

17.
Xyleme is a huge warehouse integrating XML data of the Web. Xyleme considers a simple data model with data trees and tree types for describing the data sources, and a simple query language based on tree queries with boolean conditions. The main components of the data model are a mediated schema modeled by an abstract tree type, as a view of a set of tree types associated with actual data trees, called concrete tree types, and a mapping expressing the connection between the mediated schema and the concrete tree types. The first contribution of this paper is formal: we provide a declarative model-theoretic semantics for Xyleme tree queries, a way of checking tree query containment, and a characterization of tree queries as a composition of branch queries. The other contributions are algorithmic and handle the potentially huge size of the mapping relation which is a crucial issue for semantic integration and query evaluation in Xyleme. First, we propose a method for pre-evaluating queries at compile time by storing some specific meta-information about the mapping into map translation tables. These map translation tables summarize the set of all the branch queries that can be generated from the mediated schema and the set of all the mappings. Then, we propose different operators and strategies for relaxing queries which, having an empty map translation table, will have no answer if they are evaluated against the data. Finally, we present a method for semi-automatically generating the mapping relation.

18.
Nowadays, huge volumes of data are organized or exported in tree-structured form. Querying capabilities are provided through tree-pattern queries. The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures have driven, recently, the suggestion of query languages that relax the complete specification of a tree pattern. In this paper, we consider a query language that allows the partial specification of a tree pattern. Queries in this language range from structureless keyword-based queries to completely specified tree patterns. To support the evaluation of partially specified queries, we use semantically rich constructs, called dimension graphs, which abstract structural information of the tree-structured data. We address the problem of query containment in the presence of dimension graphs and we provide necessary and sufficient conditions for query containment. As checking query containment can be expensive, we suggest two heuristic approaches for query containment in the presence of dimension graphs. Our approaches are based on extracting structural information from the dimension graph that can be added to the queries while preserving equivalence with respect to the dimension graph. We considered both cases: extracting and storing different types of structural information in advance, and extracting information on-the-fly (at query time). Both approaches are implemented, validated, and compared through experimental evaluation.

19.
Advanced application domains such as computer-aided design, computer-aided software engineering, and office automation are characterized by their need to store, retrieve, and manage large quantities of data having complex structures. A number of object-oriented database management systems (OODBMS) are currently available that can effectively capture and process the complex data. The existing implementations of OODBMS outperform relational systems by maintaining and querying cross-references among related objects. However, the existing OODBMS still do not meet the efficiency requirements of advanced applications that require the execution of complex queries involving the retrieval of a large number of data objects and relationships among them. Parallel execution can significantly improve the performance of complex OO queries. In this paper, we analyze the performance of parallel OO query processing algorithms for various benchmark application domains. The application domains are characterized by specific mixes of queries of different semantic complexities. The performance of the application domains has been analyzed for various system and data parameters by running parallel programs on a 32-node transputer based parallel machine developed at the IBM Research Center at Yorktown Heights. The parallel processing algorithms, data routing techniques, and query management and control strategies have been implemented to obtain accurate estimation of controlling and processing overheads. However, generation of large complex databases for the study was impractical. Hence, the data used in the simulation have been parameterized. The parallel OO query processing algorithms analyzed in this study are based on a query graph approach rather than the traditional query tree approach. Using the query graph approach, a query is processed by simultaneously initiating the execution at several object classes, thereby, improving the parallelism. 
During processing, the algorithms avoid the execution of time-consuming join operations by making use of the object references among the objects. Further, the algorithms do not generate any temporary data, thereby reducing disk accesses. This is accomplished by marking the selected objects and by employing a two-phase query processing strategy.

20.
In this paper, we present a federated query processing approach to evaluate queries on an Object-Oriented (OO) federated database. This approach has been designed and implemented in the OO-Myriad project, which is an OO extension to the Myriad FDBS research [myriad:94]. Since data integration is performed as part of federated query processing, we have proposed outer-join, outer-difference, and generalized attribute derivation operations, together with the traditional relational operations, to be used for integration purposes. To define an OO federated database as a virtual view on multiple OO export databases, we adopt a database mapping strategy that systematically derives each of the class extents, deep class extents, and relationships of the federated database using an operator tree consisting of the integration operations. By augmenting federated database queries with this algebraic mapping information, query execution plans can be generated. Based on the original Myriad query processing framework, we have realized the proposed OO federated query processing approach in the OO-Myriad prototype.
