首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
The goal of Knowledge Compilation is to represent a Boolean expression in a format in which it can answer a range of “online-queries” in PTIME. The online-query of main interest to us is model counting, because of its application to query evaluation on probabilistic databases, but other online-queries can be supported as well such as testing for equivalence, testing for implication, etc. In this paper we study the following problem: given a database query q, decide whether its lineage can be compiled efficiently into a given target language. We consider four target languages, of strictly increasing expressive power (when the size of compilation is restricted to be polynomial in the data size): read-once Boolean formulae, OBDD, FBDD and d-DNNF. For each target, we study the class of database queries that admit polynomial size representation: these queries can also be evaluated in PTIME over probabilistic databases. When queries are restricted to conjunctive queries without self-joins, it was known that these four classes collapse to the class of hierarchical queries, which is also the class of PTIME queries over probabilistic databases. Our main result in this paper is that, in the case of Unions of Conjunctive Queries (UCQ), these classes form a strict hierarchy. Thus, unlike conjunctive queries without self-joins, the expressive power of UCQ differs considerably with respect to these target compilation languages. Moreover, we give a complete characterization of the first two target languages, based on the query’s syntax.  相似文献   

2.
Wrappers export the schema and data of existing heterogeneous databases and support queries on them. In the context of cooperative information systems, we present a flexible approach to specify the derivation of object-oriented (OO) export databases from local relational databases. Our export database derivation consists of a set of extent derivation structures (EDS) which defines the extent and deep extent of export classes. Having well-defined semantics, the EDS can be readily used in transforming wrapper queries to local queries. Based on the EDS, we developed a wrapper query evaluation strategy which handles OO queries on the export databases. The strategy is unique in that it considers the limited query processing capabilities of local database systems as well as the language constraints on the local query languages.  相似文献   

3.
Semistructured data occur in situations where information lacks a homogeneous structure and is incomplete. Yet, up to now the incompleteness of information has not been reflected by special features of query languages. Our goal is to investigate the principles of queries that allow for incomplete answers. We do not present, however, a concrete query language. Queries over classical structured data models contain a number of variables and constraints on these variables. An answer is a binding of the variables by elements of the database such that the constraints are satisfied. In the present paper, we loosen this concept in so far as we allow also answers that are partial; that is, not all variables in the query are bound by such an answer. Partial answers make it necessary to refine the model of query evaluation. The first modification relates to the satisfaction of constraints: in some circumstances we consider constraints involving unbound variables as satisfied. Second, in order to prevent a proliferation of answers, we only accept answers that are maximal in the sense that there are no assignments that bind more variables and satisfy the constraints of the query. Our model of query evaluation consists of two phases, a search phase and a filter phase. Semistructured databases are essentially labeled directed graphs. In the search phase, we use a query graph containing variables to match a maximal portion of the database graph. We investigate three different semantics for query graphs, which give rise to three variants of matching. For each variant, we provide algorithms and complexity results. In the filter phase, the maximal matchings resulting from the search phase are subjected to constraints, which may be weak or strong. Strong constraints require all their variables to be bound, while weak constraints do not. We describe a polynomial algorithm for evaluating a special type of queries with filter constraints, and assess the complexity of evaluating other queries for several kinds of constraints. In the final part, we investigate the containment problem for queries consisting only of search constraints under the different semantics.  相似文献   

4.
This paper considers the theory of database queries on the complex value data model with external functions. Motivated by concerns regarding query evaluation, we first identify recursive sets of formulas, called embedded allowed, which is a class with desirable properties of “reasonable” queries.We then show that all embedded allowed calculus (or fix-point) queries are domain independent and continuous. An algorithm for translating embedded allowed queries into equivalent algebraic expressions as a basis for evaluating safe queries in all calculus-based query classes has been developed.Finally we discuss the topic of “domain independent query programs”, compare the expressive power of the various complex value query languages and their embedded allowed versions, and discuss the relationship between safety, embedded allowed, and domain independence in the various calculus-based queries.  相似文献   

5.
Providing top-k typical relevant keyword queries would benefit the users who cannot formulate appropriate queries to express their imprecise query intentions. By extracting the semantic relationships both between keywords and keyword queries, this paper proposes a new keyword query suggestion approach which can provide typical and semantically related queries to the given query. Firstly, a keyword coupling relationship measure, which considers both intra- and inter-couplings between each pair of keywords, is proposed. Then, the semantic similarity of different keyword queries can be measured by using a semantic matrix, in which the coupling relationships between keywords in queries are reserved. Based on the query semantic similarities, we next propose an approximation algorithm to find the most typical queries from query history by using the probability density estimation method. Lastly, a threshold-based top-k query selection method is proposed to expeditiously evaluate the top-k typical relevant queries. We demonstrate that our keyword coupling relationship and query semantic similarity measures can capture the coupling relationships between keywords and semantic similarities between keyword queries accurately. The efficiency of query typicality analysis and top-k query selection algorithm is also demonstrated.  相似文献   

6.
Traditional spatial queries return, for a given query object q, all database objects that satisfy a given predicate, such as epsilon range and k-nearest neighbors. This paper defines and studies inverse spatial queries, which, given a subset of database objects Q and a query predicate, return all objects which, if used as query objects with the predicate, contain Q in their result. We first show a straightforward solution for answering inverse spatial queries for any query predicate. Then, we propose a filter-and-refinement framework that can be used to improve efficiency. We show how to apply this framework on a variety of inverse queries, using appropriate space pruning strategies. In particular, we propose solutions for inverse epsilon range queries, inverse k-nearest neighbor queries, and inverse skyline queries. Furthermore, we show how to relax the definition of inverse queries in order to ensure non-empty result sets. Our experiments show that our framework is significantly more efficient than naive approaches.  相似文献   

7.
One of the key difficulties for users in information retrieval is to formulate appropriate queries to submit to the search engine. In this paper, we propose an approach to enrich the user’s queries by additional context. We used the Language Model to build the query context, which is composed of the most similar queries to the query to expand and their top-ranked documents. Then, we applied a query expansion approach based on the query context and the Latent Semantic Analyses method. Using a web test collection, we tested our approach on short and long queries. We varied the number of recommended queries and the number of expansion terms to specify the appropriate parameters for the proposed approach. Experimental results show that the proposed approach improves the effectiveness of the information retrieval system by 19.23 % for short queries and 52.94 % for long queries according to the retrieval results using the original users’ queries.  相似文献   

8.
Top-k query in a wireless sensor network is to find the k sensor nodes with the highest sensing values. To evaluate the top-k query in such an energy-constrained network poses great challenges, due to the unique characteristics imposed on its sensors. Existing solutions for top-k query in the literature mainly focused on energy efficiency but little attention has been paid to the query response time and its effect on the network lifetime. In this paper we address the query response time and its effect on the network lifetime through the study of the top-k query problem in sensor networks with the response time constraint. We aim at finding an energy-efficient routing tree and evaluating top-k queries on the tree such that the network lifetime is significantly prolonged, provided that the query response time constraint is met too. To do so, we first present a cost model of energy consumption for answering top-k queries and introduce the query response time definition. We then propose a novel joint query optimization framework, which consists of finding a routing tree in the network and devising a filter-based evaluation algorithm for top-k query evaluation on the tree. We finally conduct extensive experiments by simulation to evaluate the performance of the proposed algorithms, in terms of the total energy consumption, the maximum energy consumption among nodes, the query response time, and the network lifetime. The experimental results showed that there is a non-trivial tradeoff between the query response time and the network lifetime, and the joint query optimization framework can prolong the network lifetime significantly under a specified query response time constraint.  相似文献   

9.
This paper explores the fidelity of queries issued in pervasive computing networks. A query’s fidelity, or how well its results reflect the state of the environment, can be significantly impacted by dynamics that occur during its distributed execution. We focus on continuous queries that can be built out of sequences of consecutive snapshot queries and show how the fidelity of snapshots can be used to determine the fidelity of continuous queries. This simple notion of continuous query fidelity can be used to adapt query processing to impact quality and cost tradeoffs given the current state of the environment.  相似文献   

10.
Query processing in the uncertain database has become increasingly important due to the wide existence of uncertain data in many real applications. Different from handling precise data, the uncertain query processing needs to consider the data uncertainty and answer queries with confidence guarantees. In this paper, we formulate and tackle an important query, namely probabilistic inverse ranking (PIR) query, which retrieves possible ranks of a given query object in an uncertain database with confidence above a probability threshold. We present effective pruning methods to reduce the PIR search space, which can be seamlessly integrated into an efficient query procedure. Moreover, we tackle the problem of PIR query processing in high dimensional spaces, which reduces high dimensional uncertain data to a lower dimensional space. Furthermore, we study three interesting and useful aggregate PIR queries, that is, MAX, top-m, and AVG? PIRs. Moreover, we also study an important query type, PIR with uncertain query object (namely UQ-PIR), and design specific rules to facilitate the pruning. Extensive experiments have demonstrated the efficiency and effectiveness of our proposed approaches over both real and synthetic data sets, under various experimental settings.  相似文献   

11.
Path queries have been extensively used to query semistructured data, such as the Web and XML documents. In this paper we introduce weighted path queries, an extension of path queries enabling several classes of optimization problems (such as the computation of shortest paths) to be easily expressed. Weighted path queries are based on the notion of weighted regular expression, i.e., a regular expression whose symbols are associated to a weight. We characterize the problem of answering weighted path queries and provide an algorithm for computing their answer. We also show how weighted path queries can be effectively embedded into query languages for XML data to express in a simple and compact form several meaningful research problems.  相似文献   

12.
Answering queries using views is the problem which examines how to derive the answers to a query when we only have the answers to a set of views. Constructing rewritings is a widely studied technique to derive those answers. In this paper we consider the problem of the existence of rewritings in the case where the answers to the views uniquely determine the answers to the query. Specifically, we say that a view set Vdetermines a query Q if for any two databases D1,D2 it holds: V(D1)=V(D2) implies Q(D1)=Q(D2). We consider the case where query and views are defined by conjunctive queries and investigate the question: If a view set V determines a query Q, is there an equivalent rewriting of Q using V? We present here interesting cases where there are such rewritings in the language of conjunctive queries. Interestingly, we identify a class of conjunctive queries, CQpath, for which a view set can produce equivalent rewritings for “almost all” queries which are determined by this view set. We introduce a problem which relates determinacy to query equivalence. We show that there are cases where restricted results can carry over to broader classes of queries.  相似文献   

13.
Given a prior probability distribution over a set of possible oracle functions, we define a number of queries to be useless for determining some property of the function if the probability that the function has the property is unchanged after the oracle responds to the queries. A familiar example is the parity of a uniformly random Boolean-valued function over {1,2,…,N}, for which N−1 classical queries are useless. We prove that if 2k classical queries are useless for some oracle problem, then k quantum queries are also useless. For such problems, which include classical threshold secret sharing schemes, our result also gives a new way to obtain a lower bound on the quantum query complexity, even in cases where neither the function nor the property to be determined is Boolean.  相似文献   

14.
We investigate the typechecking problem for XML queries: statically verifying that every answer to a query conforms to a given output DTD, for inputs satisfying a given input DTD. This problem had been studied by a subset of the authors in a simplified framework that captured the structure of XML documents but ignored data values. We revisit here the typechecking problem in the more realistic case when data values are present in documents and tested by queries. In this extended framework, typechecking quickly becomes undecidable. However, it remains decidable for large classes of queries and DTDs of practical interest. The main contribution of the present paper is to trace a fairly tight boundary of decidability for typechecking with data values. The complexity of typechecking in the decidable cases is also considered.  相似文献   

15.
In this paper, we prove that a query plan is safe in tuple independent probabilistic databases if and only if its every answer tuple is tree structured in probabilistic graphical models. We classify hierarchical queries into core and non-core hierarchical queries and show that the existing methods can only generate safe plans for core hierarchical queries. Inspired by the bucket elimination framework, we give the sufficient and necessary conditions for the answer relation of every candidate sub-query to be used as a base relation. Finally, the proposed algorithm generates safe plans for extensional query evaluation on non-boolean hierarchical queries and invokes the SPROUT algorithm [24] for intensional query evaluation on boolean queries. A case study on the TPC-H benchmark reveals that the safe plans of Q7 and Q8 can be evaluated efficiently. Furthermore, extensive experiments show that safe plans generated by the proposed algorithm scale well.  相似文献   

16.
We study the power of nonadaptive quantum query algorithms, which are algorithms whose queries to the input do not depend on the result of previous queries. First, we show that any bounded-error nonadaptive quantum query algorithm that computes a total boolean function depending on n variables must make Ω(n) queries to the input in total. Second, we show that, if there exists a quantum algorithm that uses k nonadaptive oracle queries to learn which one of a set of m boolean functions it has been given, there exists a nonadaptive classical algorithm using queries to solve the same problem. Thus, in the nonadaptive setting, quantum algorithms for these tasks can achieve at most a very limited speed-up over classical query algorithms.  相似文献   

17.
This paper considers the efficient evaluation of recursive queries expressed using Horn clauses. We define sideways information passing formally and show how a query evaluation algorithm may be defined in terms of sideways information passing and control. We then consider a class of information-passing strategies that suffices to describe most query evaluation algorithms in the database literature, and show that these strategies may always be implemented by rewriting a given program and evaluating the rewritten program bottom-up. We describe in detail several algorithms for rewriting a program. These algorithms generalize the counting and magic-sets algorithms to work with arbitrary programs. Safety and optimality of the algorithms are also considered.  相似文献   

18.
Identifying and interpreting user intent are fundamental to semantic search. In this paper, we investigate the association of intent with individual words of a search query. We propose that words in queries can be classified as either content or intent, where content words represent the central topic of the query, while users add intent words to make their requirements more explicit. We argue that intelligent processing of intent words can be vital to improving the result quality, and in this work we focus on intent word discovery and understanding. Our approach towards intent word detection is motivated by the hypotheses that query intent words satisfy certain distributional properties in large query logs similar to function words in natural language corpora. Following this idea, we first prove the effectiveness of our corpus distributional features, namely, word co-occurrence counts and entropies, towards function word detection for five natural languages. Next, we show that reliable detection of intent words in queries is possible using these same features computed from query logs. To make the distinction between content and intent words more tangible, we additionally provide operational definitions of content and intent words as those words that should match, and those that need not match, respectively, in the text of relevant documents. In addition to a standard evaluation against human annotations, we also provide an alternative validation of our ideas using clickthrough data. Concordance of the two orthogonal evaluation approaches provide further support to our original hypothesis of the existence of two distinct word classes in search queries. Finally, we provide a taxonomy of intent words derived through rigorous manual analysis of large query logs.  相似文献   

19.
Search engine users often encounter the difficulty of phrasing the precise query that could lead to satisfactory search results. Query recommendation is considered an effective assistant in enhancing keyword-based queries in search engines and Web search software. In this paper, we present a Query-URL Bipartite based query reCommendation approach, called QUBiC. It utilizes the connectivity of a query-URL bipartite graph to recommend related queries and can significantly improve the accuracy and effectiveness of personalized query recommendation systems comparing with the conventional pairwise similarity based approach. The main contribution of the QUBiC approach is its three-phase framework for personalized query recommendations. The first phase is the preparation of queries and their search results returned by a search engine, which generates a historical query-URL bipartite collection. The second phase is the discovery of similar queries by extracting a query affinity graph from the bipartite graph, instead of operating on the original bipartite graph directly using biclique-based approach or graph clustering. The query affinity graph consists of only queries as its vertices and its edges are weighted according to a query-URL vector based similarity (dissimilarity) measure. The third phase is the ranking of similar queries. We devise a novel rank mechanism for ordering the related queries based on the merging distances of a hierarchical agglomerative clustering (HAC). By utilizing the query affinity graph and the HAC-based ranking, we are able to capture the propagation of similarity from query to query by inducing an implicit topical relatedness between queries. Furthermore, the flexibility of the HAC strategy makes it possible for users to interactively participate in the query recommendation process, and helps to bridge the gap between the determinacy of actual similarity values and the indeterminacy of users’ information needs, allowing the lists of related queries to be changed from user to user and query to query, thus adaptively recommending related queries on demand. Our experimental evaluation results show that the QUBiC approach is highly efficient and more effective compared to the conventional query recommendation systems, yielding about 13.3 % as the most improvement in terms of precision.  相似文献   

20.
The typical workload in a database system consists of a mix of multiple queries of different types that run concurrently. Interactions among the different queries in a query mix can have a significant impact on database performance. Hence, optimizing database performance requires reasoning about query mixes rather than considering queries individually. Current database systems lack the ability to do such reasoning. We propose a new approach based on planning experiments and statistical modeling to capture the impact of query interactions. Our approach requires no prior assumptions about the internal workings of the database system or the nature and cause of query interactions, making it portable across systems. To demonstrate the potential of modeling and exploiting query interactions, we have developed a novel interaction-aware query scheduler for report-generation workloads. Our scheduler, called QShuffler, uses two query scheduling algorithms that leverage models of query interactions. The first algorithm is optimized for workloads where queries are submitted in large batches. The second algorithm targets workloads where queries arrive continuously, and scheduling decisions have to be made online. We report an experimental evaluation of QShuffler using TPC-H workloads running on IBM DB2. The evaluation shows that QShuffler, by modeling and exploiting query interactions, can consistently outperform (up to 4x) query schedulers in current database systems.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号