首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
One of the most important reasoning tasks on queries is checking containment, i.e., verifying whether one query yields necessarily a subset of the result of another one. Query containment is crucial in several contexts, such as query optimization, query reformulation, knowledge-base verification, information integration, integrity checking, and cooperative answering. Containment is undecidable in general for Datalog, the fundamental language for expressing recursive queries. On the other hand, it is known that containment between monadic Datalog queries and between Datalog queries and unions of conjunctive queries are decidable. It is also known that containment between unions of conjunctive two-way regular path queries, which are queries used in the context of semistructured data models containing a limited form of recursion in the form of transitive closure, is decidable. In this paper, we combine the automata-theoretic techniques at the base of these two decidability results to show that containment of Datalog in union of conjunctive two-way regular path queries is decidable in 2EXPTIME. By sharpening a known lower bound result for containment of Datalog in union of conjunctive queries we show also a matching lower bound.  相似文献   

2.
3.
Domain independence and the relational calculus   总被引:1,自引:0,他引:1  
Several alternative semantics (or interpretations) of the relational (domain) calculus are studied here. It is shown that they all have the same expressive power, i.e., the selection of any of the semantics neither gains nor loses expressive power.Since the domain is potentially infinite, the answer to a relational calculus query is sometimes infinite (and hence not a relation). The following approaches which guarantee the finiteness of answers to queries are studied here:output-restricted unlimited interpretation, domain independent queries, output-restricted finite andcountable invention, andlimited interpretation. Of particular interest is the output-restricted unlimited interpretation—although the output is restricted to the active domain of the input and query, the quantified variables range over the infinite underlying domain. While this is close to the intuitive interpretation given to calculus formulas, the naive approach to evaluating queries under this semantics calls for the impossible task of examining infinitely many values. We describe here a constructiion which, given a queryQ under the output-restricted unlimited interpretation, yields a domain independent queryQ, with length no more than exponential in the length ofQ, such thatQ andQ (under their respective semantics) express the same function.This work supported in part by NSF grants IST-85-11541 and IRI-87-19875Work by this author was also supported in part by NSF grant IRI-9109520  相似文献   

4.
In data applications such as information integration, there can be limited access patterns to relations, i.e., binding patterns require values to be specified for certain attributes in order to retrieve data from a relation. As a consequence, we cannot retrieve all tuples from these relations. In this article we study the problem of computing the complete answer to a query, i.e., the answer that could be computed if all the tuples could be retrieved. A query is stable if for any instance of the relations in the query, its complete answer can be computed using the access patterns permitted by the relations. We study the problem of testing stability of various classes of queries, including conjunctive queries, unions of conjunctive queries, and conjunctive queries with arithmetic comparisons. We give algorithms and complexity results for these classes of queries. We show that stability of datalog programs is undecidable, and give a sufficient condition for stability of datalog queries. Finally, we study data-dependent computability of the complete answer to a nonstable query, and propose a decision tree for guiding the process to compute the complete answer.Received: 6 December 2001, Accepted: 25 November 2002, Published online: 3 April 2003Chen Li: This article combines and integrates some content in the technical report at Stanford University [25] and the paper presented in the 8th International Conference on Database Theory (ICDT), London, UK, January, 2001 [28]. In addition to the prior materials, this article contains more results and complete proofs that were not included in the original reports.  相似文献   

5.
When answering queries using external information sources, the contents of the queries can be described by views. To answer a query, we must rewrite it using the set of views presented by the sources. When the external information sources also have the ability to answer some (perhaps limited) sets of queries that require performing operations on their data, the set of views presented by the source may be infinite (albeit encoded in some finite fashion). Previous work on answering queries using views has only considered the case where the set of views is finite. In order to exploit the ability of information sources to answer more complex queries, we consider the problem of answering conjunctive queries using infinite sets of conjunctive views. Our first result is that an infinite set of conjunctive views can be partitioned into a finite number of equivalence classes, such that picking one view from every nonempty class is sufficient to determine whether the query can be answered using the views. Second, we show how to compute the set of equivalence classes for sets of conjunctive views encoded by a datalog program. Furthermore, we extend our results to the case when the query and the views use the built-in predicates <, ⩽, =, and ≠, and they are interpreted over a dense domain. Finally, we extend our results to conjunctive queries and views with the built-in predicates <, ⩽, and = interpreted over the integers. In doing so we present a result of independent interest, namely, an algorithm to minimize such queries.  相似文献   

6.
We present results on the expressive power of various deductive database languages extended with stratified aggregation. We show that (1) Datalog extended with stratified aggregation cannot express a query to count the number of paths between every pair of nodes in an acyclic graph, (2) Datalog extended with stratified aggregation and arithmetic on integers (the + operator) can express allcomputable queries on ordered domains, and (3) Datalog extended with stratified aggregation and generic function symbols can express allcomputable queries (on ordered or unordered domains). Note that without stratified aggregation, the above extensions of Datalog cannot express all computable queries. We show that replacing stratified aggregation by stratified negation preserves expressiveness. We identify subclasses of the above languages that are complete (can express all, and only the, computable queries).  相似文献   

7.
Navigational features have been largely recognized as fundamental for graph database query languages. This fact has motivated several authors to propose RDF query languages with navigational capabilities. In this paper, we propose the query language nSPARQL that uses nested regular expressions to navigate RDF data. We study some of the fundamental properties of nSPARQL and nested regular expressions concerning expressiveness and complexity of evaluation. Regarding expressiveness, we show that nSPARQL is expressive enough to answer queries considering the semantics of the RDFS vocabulary by directly traversing the input graph. We also show that nesting is necessary in nSPARQL to obtain this last result, and we study the expressiveness of the combination of nested regular expressions and SPARQL operators. Regarding complexity of evaluation, we prove that given an RDF graph G and a nested regular expression E, this problem can be solved in time O(|G||E|).  相似文献   

8.
The goal of Knowledge Compilation is to represent a Boolean expression in a format in which it can answer a range of “online-queries” in PTIME. The online-query of main interest to us is model counting, because of its application to query evaluation on probabilistic databases, but other online-queries can be supported as well such as testing for equivalence, testing for implication, etc. In this paper we study the following problem: given a database query q, decide whether its lineage can be compiled efficiently into a given target language. We consider four target languages, of strictly increasing expressive power (when the size of compilation is restricted to be polynomial in the data size): read-once Boolean formulae, OBDD, FBDD and d-DNNF. For each target, we study the class of database queries that admit polynomial size representation: these queries can also be evaluated in PTIME over probabilistic databases. When queries are restricted to conjunctive queries without self-joins, it was known that these four classes collapse to the class of hierarchical queries, which is also the class of PTIME queries over probabilistic databases. Our main result in this paper is that, in the case of Unions of Conjunctive Queries (UCQ), these classes form a strict hierarchy. Thus, unlike conjunctive queries without self-joins, the expressive power of UCQ differs considerably with respect to these target compilation languages. Moreover, we give a complete characterization of the first two target languages, based on the query’s syntax.  相似文献   

9.
We study the problem of querying data sources that accept only a limited set of queries, such as sources accessible by Web services which can implement very large (potentially infinite) families of queries. We revisit a classical setting in which the application queries are conjunctive queries and the source accepts families of conjunctive queries specified as the expansions of a (potentially recursive) Datalog program with parameters.  相似文献   

10.
In this paper, we present a general procedure to test conjunctive query containment. We divide the containment problem into four categories, taking into account the underlying semantics (set or bag theoretic) and the presence or absence of built-in predicates in the queries. After a brief review of previous work on conjunctive query containment, we present a new procedure, called QCC (Query Containment Checker), which we show to be a general and uniform procedure to check the containment among conjunctive queries under the four categories mentioned above. We briefly describe the use of QCC to check bag containment of conjunctive queries, and explain in detail how to use QCC to check set containment of conjunctive queries with built-in predicates. In our conclusions, we point out some uses of QCC for other types of containment. Received: 21 January 2000 / 19 November 2001  相似文献   

11.
12.
An important feature of database technology of the nineties is the use of parallelism for speeding up the execution of complex queries. This technology is being tested in several experimental database architectures and a few commercial systems for conventional select-project-join queries. In particular, hash-based fragmentation is used to distribute data to disks under the control of different processors in order to perform selections and joins in parallel. With the development of new query languages, and in particular with the definition of transitive closure queries and of more general logic programming queries, the new dimension of recursion has been added to query processing. Recursive queries are complex; at the same time, their regular structure is particularly suited for parallel execution, and parallelism may give a high efficiency gain. We survey the approaches to parallel execution of recursive queries that have been presented in the recent literature. We observe that research on parallel execution of recursive queries is separated into two distinct subareas, one focused on the transitive closure of Relational Algebra expressions, the other one focused on optimization of more general Datalog queries. Though the subareas seem radically different because of the approach and formalism used, they have many common features. This is not surprising, because most typical Datalog queries can be solved by means of the transitive closure of simple algebraic expressions. We first analyze the relationship between the transitive closure of expressions in Relational Algebra and Datalog programs. We then review sequential methods for evaluating transitive closure, distinguishing iterative and direct methods. We address the parallelization of these methods, by discussing various forms of parallelization. Data fragmentation plays an important role in obtaining parallel execution; we describe hash-based and semantic fragmentation. Finally, we consider Datalog queries, and present general methods for parallel rule execution; we recognize the similarities between these methods and the methods reviewed previously, when the former are applied to linear Datalog queries. We also provide a quantitative analysis that shows the impact of the initial data distribution on the performance of methods. Recommended by: Patrick Valduriez  相似文献   

13.
Let be the language defined by some deterministick-state automaton with accepting statesF, and letG be a directed graph withn nodes andm labeled arcs. Thedynamic -path problem is to process efficiently and on-line two kinds of operations: (1) inserting arcs intoG, and (2) given two nodesu andv inG, finding a path fromu tov that is labeled by some word of , or reporting that none exists. We present a data structure that supports insertion and regular path existence queries inO(nk 2) amortized time andO(|F|) worst-case time, respectively. Deletions only (no insertions) can also be accommodated in directed acyclic graphs. Finding an -path between two nodes can be done inO(l+|F|) worst-case time, wherel is the length of the path returned. This is an improvement over theO(m) time required to answer queries in the static version of this problem, for each fixed infinite . We show how this data structure and the techniques used for building it are applicable to the area of knowledge base querying. In an amortized setting, we provide relative improvements ofO(m/n) to the time bounds for answering many one-sided recursive queries and even some two-sided recursive queries, such as the same generation query on acyclic graphs.An extended abstract of this paper was presented at the first annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, January 1990.Work partially completed while at Brown University. Work at Princeton partially supported by the NSF Center in Discrete Mathematics and Theoretical Computer Science (DIMACS).Work supported in part by NSF grant IRI-8617344, by an Alfred P. Sloan Foundation Fellowship, and by ONR grant N00014-83-K-0146, ARPA Order No. 4786.Work supported in part by an NSF Presidential Young Investigator Award with matching funds from IBM, by NSF research grant DCR-8403613, and by ONR grant N00014-83-K-0146, ARPA Order No. 4786.  相似文献   

14.
Temporal logic queries on Datalog and negated Datalog programs are studied, and their relationship to Datalog queries on these programs is explored. It is shown that, in general, temporal logic queries have more expressive power than Datalog queries on Datalog and negated Datalog programs. It is also shown that anexistential domain-independent fragment of temporal logic queries has the same expressive power as Datalog queries on negated Datalog programs with inflationary semantics. This means that for finite structures this class of queries has the power of the fixpoint logic.  相似文献   

15.
Semijoin is a relational operator used in many relational query processing algorithms. Semijoins can be used to “reduce” the database by delimitting portions of the database that contain data relevant to a given query. For some queries, there exist sequences of semijoins that delimit the exact portions of the database needed to answer the query. Such sequences are called full reducers.

This paper considers a class of queries called natural inequality queries (NI queries), and characterizes a subclass for which full reducers exist. We also present an efficient algorithm that decides whether an NI query lies within this subclass, and constructs a full reducer for the query. The NI queries are a subset of the aggregate-free, conjunctive queries of QUEL, and permit join clauses to include <, , =, , >.  相似文献   


16.
In a previous paper (J. Comput. System Sci. 64 (2002) 519), the authors introduced the notion of hypertree decomposition and the corresponding concept of hypertree width and showed that the conjunctive queries whose hypergraphs have bounded hypertree width can be evaluated in polynomial time. Bounded hypertree width generalizes the notions of acyclicity and bounded treewidth and corresponds to larger classes of tractable queries. In the present paper, we provide natural characterizations of hypergraphs and queries having bounded hypertree width in terms of game-theory and logic. First we define the Robber and Marshals game, and prove that a hypergraph H has hypertree width at most k if and only if k marshals have a winning strategy on H, allowing them to trap a robber who moves along the hyperedges. This game is akin the well-known Robber and Cops game (which characterizes bounded treewidth), except that marshals are more powerful than cops: They can control entire hyperedges instead of just vertices. Kolaitis and Vardi (J. Comput. System Sci. 61 (2000) 302) recently gave an elegant characterization of the conjunctive queries having treewidth <k in terms of the k-variable fragment of a certain logic L (=existential-conjunctive fragment of positive FO). We use the Robber and Marshals game to derive a surprisingly simple and equally elegant characterization of the class HW[k] of queries of hypertree width at most k in terms of guarded logic. In particular, we show that HW[k]=GFk(L), where GFk(L) denotes the k-guarded fragment of L. In this fragment, conjunctions of k atoms rather than just single atoms are allowed to act as guards. Note that, for the particular case k=1, our results provide new characterizations of the class of acyclic queries. We extend the notion of bounded hypertree width to nonrecursive stratified Datalog and show that the k-guarded fragment GFk(FO) of first-order logic has the same expressive power as nonrecursive stratified Datalog of hypertree width at most k.  相似文献   

17.
Given a query workload, a database and a set of constraints, the view-selection problem is to select views to materialize so that the constraints are satisfied and the views can be used to compute the queries in the workload efficiently. A typical constraint, which we consider in the present work, is to require that the views can be stored in a given amount of disk space. Depending on features of SQL queries (e.g., the DISTINCT keyword) and on whether the database relations on which the queries are applied are sets or bags, the queries may be computed under set semantics, bag-set semantics, or bag semantics. In this paper we study the complexity of the view-selection problem for conjunctive queries and views under these semantics. We show that bag semantics is the “easiest to handle” (we show that in this case the decision version of view selection is in NP), whereas under set and bag-set semantics we assume further restrictions on the query workload (we only allow queries without self-joins in the workload) to achieve the same complexity. Moreover, while under bag and bag-set semantics filtering views (i.e., subgoals that can be dropped from the rewriting without impacting equivalence to the query) are practically not needed, under set semantics filtering views can reduce significantly the query-evaluation costs. We show that under set semantics the decision version of the view-selection problem remains in NP only if filtering views are not allowed in the rewritings. Finally, we investigate whether the cgalg algorithm for view selection introduced in Chirkova and Genesereth (Linearly bounded reformulations of conjunctive databases, pp. 987–1001, 2000) is suitable in our setting. We prove that this algorithm is sound for all cases we examine here, and that it is complete under bag semantics for workloads of arbitrary conjunctive queries and under bag-set semantics for workloads of conjunctive queries without self-joins. Rada Chirkova’s work on this material has been supported by the National Science Foundation under Grant No. 0307072. The project is co-funded by the European Social Fund (75%) and National Resources (25%)- Operational Program for Educational and Vocational Training II (EPEAEK II) and particularly the program PYTHAGORAS. A preliminary version of this paper appears in F. Afrati, R. Chirkova, M. Gergatsoulis, V. Pavlaki. Designing Views to Efficiently Answer Real SQL Queries. In Proc. of SARA 2005, LNAI Vol. 3607, pages 332-346, Springer-Verlag, 2005.  相似文献   

18.
A natural way for capturing uncertainty in the relational data model is by allowing relations that violate their primary key. A repair of such relation is obtained by selecting a maximal number of tuples without ever selecting two tuples that agree on their primary key. Given a Boolean query q, CERTAINTY(q) is the problem that takes as input a relational database and asks whether q evaluates to true on every repair of that database. In recent years, CERTAINTY(q) has been studied primarily for conjunctive queries. Conditions have been determined under which CERTAINTY(q) is coNP-complete, first-order expressible, or not first-order expressible. A remaining open question was whether there exist conjunctive queries q without self-join such that CERTAINTY(q) is in PTIME but not first-order expressible. We answer this question affirmatively.  相似文献   

19.
Declustering is a common technique used to reduce query response times. Data is declustered over multiple disks and query retrieval can be parallelized. Most of the research on declustering is targeted at spatial range queries and investigates schemes with low additive error. Recently, declustering using replication has been proposed to reduce the additive overhead. Replication significantly reduces retrieval cost of arbitrary queries. In this paper, we propose a disk allocation and retrieval mechanism for arbitrary queries based on design theory. Using the proposed c-copy replicated declustering scheme, buckets can be retrieved using at most k disk accesses. Retrieval algorithm is very efficient and is asymptotically optimal with complexity for a query Q. In addition to the deterministic worst-case bound and efficient retrieval, proposed algorithm handles nonuniform data, high dimensions, supports incremental declustering and has good fault-tolerance property. Experimental results show the feasibility of the algorithm. Recommended by: Sunil Prabhakar  相似文献   

20.
We consider two issues in polynomial-time exact learning of concepts using membership and equivalence queries: (1) errors or omissions in answers to membership queries, and (2) learning finite variants of concepts drawn from a learnable class.To study (1), we introduce two new kinds of membership queries: limited membership queries and malicious membership queries. Each is allowed to give incorrect responses on a maliciously chosen set of strings in the domain. Instead of answering correctly about a string, a limited membership query may give a special I don't know answer, while a malicious membership query may give the wrong answer. A new parameter Lis used to bound the length of an encoding of the set of strings that receive such incorrect answers. Equivalence queries are answered correctly, and learning algorithms are allowed time polynomial in the usual parameters and L. Any class of concepts learnable in polynomial time using equivalence and malicious membership queries is learnable in polynomial time using equivalence and limited membership queries; the converse is an open problem. For the classes of monotone monomials and monotone k-term DNF formulas, we present polynomial-time learning algorithms using limited membership queries alone. We present polynomial-time learning algorithms for the class of monotone DNF formulas using equivalence and limited membership queries, and using equivalence and malicious membership queries.To study (2), we consider classes of concepts that are polynomially closed under finite exceptions and a natural operation to add exception tables to a class of concepts. Applying this operation, we obtain the class of monotone DNF formulas with finite exceptions. We give a polynomial-time algorithm to learn the class of monotone DNF formulas with finite exceptions using equivalence and membership queries. We also give a general transformation showing that any class of concepts that is polynomially closed under finite exceptions and is learnable in polynomial time using standard membership and equivalence queries is also polynomial-time learnable using malicious membership and equivalence queries. Corollaries include the polynomial-time learnability of the following classes using malicious membership and equivalence queries: deterministic finite acceptors, boolean decision trees, and monotone DNF formulas with finite exceptions.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号