When answering queries using external information sources, the contents of the queries can be described by views. To answer a query, we must rewrite it using the set of views presented by the sources. When the external information sources also have the ability to answer some (perhaps limited) sets of queries that require performing operations on their data, the set of views presented by the source may be infinite (albeit encoded in some finite fashion). Previous work on answering queries using views has only considered the case where the set of views is finite. In order to exploit the ability of information sources to answer more complex queries, we consider the problem of answering conjunctive queries using infinite sets of conjunctive views. Our first result is that an infinite set of conjunctive views can be partitioned into a finite number of equivalence classes, such that picking one view from every nonempty class is sufficient to determine whether the query can be answered using the views. Second, we show how to compute the set of equivalence classes for sets of conjunctive views encoded by a datalog program. Furthermore, we extend our results to the case when the query and the views use the built-in predicates <, ⩽, =, and ≠, and they are interpreted over a dense domain. Finally, we extend our results to conjunctive queries and views with the built-in predicates <, ⩽, and = interpreted over the integers. In doing so we present a result of independent interest, namely, an algorithm to minimize such queries.

Uncertainty in deductive databases and logic programming has been modeled using a variety of (numeric and non-numeric) formalisms in the past, including probabilistic, possibilistic, and fuzzy set-theoretic approaches, and many-valued logic programming. In this paper, we consider a hybrid approach to the modeling of uncertainty in deductive databases. Our model, called deductive IST (DIST), is based on an extension of the Information Source Tracking (IST) model, recently proposed for relational databases. The DIST model permits uncertainty to be modeled and manipulated in essentially qualitative terms, with an option to convert qualitative expressions of uncertainty into numeric form (e.g., probabilities). An uncertain deductive database is modeled as a Horn clause program in the DIST framework, where each fact and rule is annotated with an expression indicating the “sources” contributing to this information and the nature of their contribution. (1) We show that positive DIST programs enjoy the least model/least fixpoint semantics analogous to classical logic programs. (2) We show that top-down (e.g., SLD-resolution) and bottom-up (e.g., magic sets rewriting followed by semi-naive evaluation) query processing strategies developed for datalog can be easily extended to DIST programs. (3) Results and techniques for handling negation as failure in classical logic programming can be easily extended to DIST. As an illustration of this, we show how stratified negation can be so extended. We next study the problem of query optimization in such databases and establish the following results. (4) We formulate query containment in qualitative as well as quantitative terms. Intuitively, our qualitative sense of containment would say a query Q1 is contained in a query Q2 provided for every input database D, for every tuple t, t ∈ Q2(D) holds in every “situation” in which t ∈ Q1(D) is true. The quantitative notion of containment would say Q1 is contained in Q2 provided on every input, the certainty associated with any tuple computed by Q1 is no more than the certainty associated with the same tuple by Q2 on the given input. We also prove that qualitative and quantitative notions of containment (both absolute and uniform versions) coincide. (5) We establish necessary and sufficient conditions for the qualitative containment of conjunctive queries. (6) We extend the well-known chase technique to develop a test for uniform containment and equivalence of positive DIST programs. (7) Finally, we prove that the complexity of testing containment of conjunctive DIST queries remains the same as in the classical case when the number of information sources is regarded as a constant (so it is NP-complete in the size of the queries). We also show that testing containment of conjunctive queries is co-NP-complete in the number of information sources.

In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under the open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries and unions of conjunctive queries. We present results about the decidability of OWA query answering under ICs. In particular, we study OWA query answering both over finite databases and over unrestricted databases, and identify the cases in which such a problem is finitely controllable, i.e., when OWA query answering over finite databases coincides with OWA query answering over unrestricted databases. Moreover, we are able to easily turn the above results into new results about implication of ICs and query containment under ICs, due to the deep relationship between OWA query answering and these two classical problems in database theory. In particular, we close two long-standing open problems in query containment, since we prove finite controllability of containment of conjunctive queries both under arbitrary inclusion dependencies and under key and foreign key dependencies. The results of our investigation are very relevant in many research areas which have recently dealt with databases under an incomplete information assumption: e.g., data integration, data exchange, view-based information access, ontology-based information systems, and peer data management systems.

In this paper, we present a general procedure to test conjunctive query containment. We divide the containment problem into four categories, taking into account the underlying semantics (set or bag theoretic) and the presence or absence of built-in predicates in the queries. After a brief review of previous work on conjunctive query containment, we present a new procedure, called QCC (Query Containment Checker), which we show to be a general and uniform procedure to check the containment among conjunctive queries under the four categories mentioned above. We briefly describe the use of QCC to check bag containment of conjunctive queries, and explain in detail how to use QCC to check set containment of conjunctive queries with built-in predicates. In our conclusions, we point out some uses of QCC for other types of containment. Received: 21 January 2000 / 19 November 2001
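For orientation, the simplest of the four categories (set semantics, no built-in predicates) can be decided with the classical containment-mapping test: Q1 is contained in Q2 exactly when some homomorphism maps Q2's head onto Q1's head and every body atom of Q2 onto a body atom of Q1. The sketch below illustrates that textbook test only; it is not the QCC procedure from the abstract. The query encoding (a pair of head tuple and list of `(predicate, args)` atoms, with capitalized strings as variables) and the names `is_var`, `extend`, and `contained_in` are assumptions made for this example.

```python
def is_var(term):
    # Assumption for this sketch: variables are capitalized strings.
    return isinstance(term, str) and term[:1].isupper()

def extend(h, src, dst):
    """Extend mapping h so that tuple src maps onto tuple dst, or return None."""
    if len(src) != len(dst):
        return None
    h = dict(h)
    for s, d in zip(src, dst):
        if is_var(s):
            if h.setdefault(s, d) != d:
                return None  # variable already mapped to a different term
        elif s != d:
            return None      # constants must match exactly
    return h

def contained_in(q1, q2):
    """True if q1 is contained in q2 (set semantics, no built-in predicates)."""
    head1, body1 = q1
    head2, body2 = q2
    h0 = extend({}, head2, head1)  # the head of q2 must map onto the head of q1
    if h0 is None:
        return False

    def search(i, h):
        # Backtracking search for a target atom for each body atom of q2.
        if i == len(body2):
            return True
        pred, args = body2[i]
        for p, a in body1:
            if p == pred:
                h2 = extend(h, args, a)
                if h2 is not None and search(i + 1, h2):
                    return True
        return False

    return search(0, h0)

# Q1(X) :- E(X, Y), E(Y, X)      Q2(X) :- E(X, Y)
q1 = (("X",), [("E", ("X", "Y")), ("E", ("Y", "X"))])
q2 = (("X",), [("E", ("X", "Y"))])
print(contained_in(q1, q2), contained_in(q2, q1))
```

The search is exponential in the worst case, which is consistent with the NP-completeness of conjunctive query containment under set semantics.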

One of the most important reasoning tasks on queries is checking containment, i.e., verifying whether one query necessarily yields a subset of the result of another one. Query containment is crucial in several contexts, such as query optimization, query reformulation, knowledge-base verification, information integration, integrity checking, and cooperative answering. Containment is undecidable in general for Datalog, the fundamental language for expressing recursive queries. On the other hand, it is known that containment between monadic Datalog queries and containment of Datalog queries in unions of conjunctive queries are decidable. It is also known that containment between unions of conjunctive two-way regular path queries, which are queries used in the context of semistructured data models containing a limited form of recursion in the form of transitive closure, is decidable. In this paper, we combine the automata-theoretic techniques at the base of these two decidability results to show that containment of Datalog in unions of conjunctive two-way regular path queries is decidable in 2EXPTIME. By sharpening a known lower bound result for containment of Datalog in unions of conjunctive queries, we also show a matching lower bound.

The query containment problem is a fundamental computer science problem which was originally defined for relational queries. With the growing popularity of the SPARQL query language, it became relevant and important in this new context: reliable and efficient SPARQL query containment solvers may have various applications within static analysis of queries, especially in the area of query optimization and refactoring. In this paper, we present a new approach for solving the query containment problem in SPARQL. The approach is based on reducing the query containment problem to the satisfiability problem in first-order logic. It covers a wide range of SPARQL language constructs, including unions of conjunctive queries, blank nodes, projections, subqueries, and the clauses FROM, FILTER, OPTIONAL, GRAPH, etc. It also covers containment under the RDF Schema entailment regime, and it can deal with the subsumption relation. We describe an implementation of the approach, the open-source solver SpeCS, and its thorough experimental evaluation on two relevant benchmarks, Query Containment Benchmark and SQCFramework. As a side result, SpeCS identified incorrect test cases within both benchmarks, which were manually checked, confirmed and fixed, resulting in better and more reliable benchmarks. The evaluation also shows that SpeCS is highly efficient and that, compared to the state-of-the-art solvers, it gives more precise results in a shorter amount of time. In addition, SpeCS has the highest coverage of the supported language constructs.

In data applications such as information integration, there can be limited access patterns to relations, i.e., binding patterns require values to be specified for certain attributes in order to retrieve data from a relation. As a consequence, we cannot retrieve all tuples from these relations. In this article we study the problem of computing the complete answer to a query, i.e., the answer that could be computed if all the tuples could be retrieved. A query is stable if for any instance of the relations in the query, its complete answer can be computed using the access patterns permitted by the relations. We study the problem of testing stability of various classes of queries, including conjunctive queries, unions of conjunctive queries, and conjunctive queries with arithmetic comparisons. We give algorithms and complexity results for these classes of queries. We show that stability of datalog programs is undecidable, and give a sufficient condition for stability of datalog queries. Finally, we study data-dependent computability of the complete answer to a nonstable query, and propose a decision tree for guiding the process to compute the complete answer. Received: 6 December 2001, Accepted: 25 November 2002, Published online: 3 April 2003. Chen Li: This article combines and integrates some content in the technical report at Stanford University [25] and the paper presented in the 8th International Conference on Database Theory (ICDT), London, UK, January, 2001 [28]. In addition to the prior materials, this article contains more results and complete proofs that were not included in the original reports.

Nonrecursive incremental evaluation of Datalog queries
We consider the problem of repeatedly evaluating the same (computationally expensive) query to a database that is being updated between successive query requests. In this situation, it should be possible to use the difference between successive database states and the answer to the query in one state to reduce the cost of evaluating the query in the next state. We use nonrecursive Datalog programs (which are unions of conjunctive queries) to compute the differences, and call this process incremental query evaluation using conjunctive queries. After formalizing the notion of incremental query evaluation using conjunctive queries, we give an algorithm that constructs, for each regular chain query (including transitive closure as a special case), a nonrecursive Datalog program to compute the difference between the answer after an update and the answer before the update. We then extend this result to weakly regular queries, which are regular chain programs augmented with conjunctive queries having the so-called Cartesian-closed increment property, and to the case of unbounded-set insertions where the sets are binary Cartesian products. Finally, we show that the class of conjunctive queries with the Cartesian-closed increment property is decidable. Parts of the results in this paper appeared as extended abstracts in the Proceedings of the 1992 International Conference on Database Theory (LNCS 646, Springer-Verlag), and in the Proceedings of the 1993 International Workshop on Database Programming Languages (Workshops in Computing, Springer-Verlag). Guozhu Dong gratefully acknowledges support of the Australian Research Council through research grants, and the Centre for Intelligent Decision Systems. Work by Jianwen Su was supported in part by NSF Grants IRI-9109520 and IRI-9117094.
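The transitive-closure special case mentioned above gives a feel for what a nonrecursive difference query looks like. When a single edge (a, b) is inserted, every new pair (x, y) in the closure has x equal to a or already reaching a, and y equal to b or already reachable from b, so the difference is a union of four conjunctive queries over the old closure. The sketch below shows this well-known delta rule; it is not the paper's general construction for regular chain queries, and the function names `transitive_closure` and `tc_delta_insert` are assumptions for this example.

```python
def transitive_closure(edges):
    """Naive fixpoint computation, used here only as a reference for comparison."""
    tc = set(edges)
    while True:
        new = {(x, w) for (x, y) in tc for (z, w) in tc if y == z} - tc
        if not new:
            return tc
        tc |= new

def tc_delta_insert(tc, a, b):
    """Pairs added to the closure tc when the single edge (a, b) is inserted.

    No recursion is needed: one join over the old closure is enough.
    """
    into_a = {x for (x, y) in tc if y == a} | {a}    # a itself, or nodes reaching a
    out_of_b = {y for (x, y) in tc if x == b} | {b}  # b itself, or nodes reachable from b
    return {(x, y) for x in into_a for y in out_of_b} - tc

edges = {(1, 2), (3, 4)}
tc = transitive_closure(edges)
delta = tc_delta_insert(tc, 2, 3)
print(sorted(delta))
```

The old closure plus the delta equals the closure recomputed from scratch on the updated edge set, which is the property an incremental evaluator relies on.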

We study containment and equivalence of (unions of) conjunctive queries on relations annotated with elements of a commutative semiring. Such relations and the semantics of positive relational queries on them were introduced in a recent paper as a generalization of set semantics, bag semantics, incomplete databases, and databases annotated with various kinds of provenance information. We obtain positive decidability results and complexity characterizations for databases with lineage, why-provenance, and provenance polynomial annotations, for both conjunctive queries and unions of conjunctive queries. At least one of these results is surprising given that provenance polynomial annotations seem “more expressive” than bag semantics and under the latter, containment of unions of conjunctive queries is known to be undecidable. The decision procedures rely on interesting variations on the notion of containment mappings. We also show that for any positive semiring (a very large class) and conjunctive queries without self-joins, equivalence is the same as isomorphism.
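To make the semiring-annotated setting concrete, the following sketch evaluates a conjunctive query over annotated relations: the annotation of an output tuple is the semiring sum, over all matching choices of body tuples, of the semiring product of their annotations. Plugging in the natural numbers with + and × gives bag semantics, and the Booleans with "or" and "and" give set semantics. The encoding and the function name `eval_cq` are assumptions for this illustration; it assumes body atoms contain only variables and that every head variable occurs in the body.

```python
from itertools import product
import operator

def eval_cq(head, body, db, plus, times, zero, one):
    """Evaluate a conjunctive query over semiring-annotated relations.

    head: tuple of variables; body: list of (relation_name, variables);
    db: {relation_name: {tuple: annotation}}.
    """
    out = {}
    # Try every combination of one annotated tuple per body atom.
    for choice in product(*(db[rel].items() for rel, _ in body)):
        binding, ann, ok = {}, one, True
        for (rel, variables), (tup, k) in zip(body, choice):
            for v, c in zip(variables, tup):
                if binding.setdefault(v, c) != c:
                    ok = False  # the same variable would need two different values
            ann = times(ann, k)
        if ok:
            t = tuple(binding[v] for v in head)
            out[t] = plus(out.get(t, zero), ann)
    return out

# Q(x, z) :- R(x, y), R(y, z), with multiplicities as annotations (bag semantics).
db = {"R": {(1, 2): 2, (2, 3): 3}}
body = [("R", ("x", "y")), ("R", ("y", "z"))]
bag = eval_cq(("x", "z"), body, db, operator.add, operator.mul, 0, 1)
print(bag)
```

Swapping the semiring changes the semantics without touching the evaluation code, which is the generalization the abstract refers to.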

Nowadays, huge volumes of data are organized or exported in tree-structured form. Querying capabilities are provided through tree-pattern queries. The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures, have recently motivated query languages that relax the complete specification of a tree pattern. In this paper, we consider a query language that allows the partial specification of a tree pattern. Queries in this language range from structureless keyword-based queries to completely specified tree patterns. To support the evaluation of partially specified queries, we use semantically rich constructs, called dimension graphs, which abstract structural information of the tree-structured data. We address the problem of query containment in the presence of dimension graphs and we provide necessary and sufficient conditions for query containment. As checking query containment can be expensive, we suggest two heuristic approaches for query containment in the presence of dimension graphs. Our approaches are based on extracting structural information from the dimension graph that can be added to the queries while preserving equivalence with respect to the dimension graph. We consider both cases: extracting and storing different types of structural information in advance, and extracting information on-the-fly (at query time). Both approaches are implemented, validated, and compared through experimental evaluation.

In this paper, we introduce a fuzzy language to extract information from the web, extending the web query language WebSQL [1]. These extensions are based on two observations: the inadequacy of traditional Boolean query languages for web documents, and the need to move beyond the notion of a query providing just a set of answers in order to provide a better data presentation through answer restructuring. In order to address the first issue, we consider fuzzy sets to express imprecision in data, queries and answers. In our case, data imprecision comes from the data classification provided by several search engines. Query imprecision occurs in weighting values provided at query definition time. Answer imprecision allows answers to be filtered and ranked. To address the second point, we provide an answer restructuring language to model the restructuring phase that follows the query phase. The restructuring language allows creation/deletion of links and page creation. Thus several answer organizations are possible as a result of the same query. The resulting language extends WebSQL in a uniform framework. Then we provide a mapping for the language constructs into an extended relational algebra called SAMEW [2], expressing similarity-based queries over imprecisely classified data, queries involving navigation among web pages, and answer restructurings. Finally, we study the optimization of similarity-based queries using equivalence and containment rules holding for SAMEW, and present several algorithms for query evaluation.

Numerous frameworks have been proposed in recent years for deductive databases with uncertainty. On the basis of how uncertainty is associated with the facts and rules in a program, we classify these frameworks into implication-based (IB) and annotation-based (AB) frameworks. We take the IB approach and propose a generic framework, called the parametric framework, as a unifying umbrella for IB frameworks. We develop the declarative, fixpoint, and proof-theoretic semantics of programs in our framework and show their equivalence. Using the framework as a basis, we then study the query optimization problem of containment of conjunctive queries in this framework and establish necessary and sufficient conditions for containment for several classes of parametric conjunctive queries. Our results yield tools for use in query optimization for large classes of query programs in IB deductive databases with uncertainty.

Answering queries using views is the problem of deriving the answers to a query when we only have the answers to a set of views. Constructing rewritings is a widely studied technique to derive those answers. In this paper we consider the problem of the existence of rewritings in the case where the answers to the views uniquely determine the answers to the query. Specifically, we say that a view set V determines a query Q if for any two databases D1 and D2, V(D1)=V(D2) implies Q(D1)=Q(D2). We consider the case where query and views are defined by conjunctive queries and investigate the question: If a view set V determines a query Q, is there an equivalent rewriting of Q using V? We present here interesting cases where there are such rewritings in the language of conjunctive queries. Interestingly, we identify a class of conjunctive queries, CQpath, for which a view set can produce equivalent rewritings for “almost all” queries which are determined by this view set. We introduce a problem which relates determinacy to query equivalence. We show that there are cases where restricted results can carry over to broader classes of queries.

We study the power of four query models in the context of property testing in general graphs, where our main case study is the problem of testing k-colorability. Two query types, which have been studied extensively in the past, are pair queries and neighbor queries. The former corresponds to asking whether there is an edge between any particular pair of vertices, and the latter to asking for the i-th neighbor of a particular vertex. We show that while for pair queries testing k-colorability requires a number of queries that is a monotone decreasing function in the average degree d, the query complexity in the case of neighbor queries remains roughly the same for every density and for large values of k. We also consider a combined model that allows both types of queries, and we propose a new, stronger, query model, related to the field of Group Testing. We give upper and lower bounds on the query complexity for one-sided error in all the models, where the bounds are nearly tight for three of the models. In some of the cases, our lower bounds extend to two-sided error algorithms. The problem of testing k-colorability was previously studied in the contexts of dense graphs and of sparse graphs, and in our proofs we unify approaches from those cases, and also provide some new tools and techniques that may be of independent interest.

Conjunctive queries (CQs) are at the core of query languages encountered in many logic-based research fields such as AI and database systems. The majority of existing work assumes set semantics, but real applications often require the manipulation of duplicate tuples. One of the major problems that arises as part of advanced features of query optimization, data integration, query reformulation and many other research topics is testing for containment of such queries. In this work, we investigate the complexity of the query containment problem for CQs under bag semantics (i.e., duplicate tuples are allowed in both the database and the results of queries) and under bag-set semantics (i.e., duplicates are allowed in the result of the queries but not in the database). We derive complexity results for these problems for five major subclasses of CQs, and we also find necessary conditions for CQ containment. The general case of these problems remains open.

Starting from the XQuery language we define XBind, an XML analog of relational conjunctive queries as well as a related class of XML integrity constraints (dependencies). We identify a fragment of XBind for which containment is decidable, in fact Π₂ᵖ-complete, and a further fragment for which containment is NP-complete. We extend the containment algorithm to take XML dependencies into account. We give an algorithm for the reformulation of XBind queries under combinations of GAV and LAV XQuery views, as well as additional dependencies. We prove a completeness theorem which guarantees that under certain conditions, our algorithm will find a minimal reformulation if one exists. Moreover, we identify conditions when this algorithm achieves optimal complexity bounds. Our results on containment and reformulation depend on certain restrictions on the query and constraint languages. We calibrate the results by showing that lifting these restrictions significantly changes the complexity of the problems.

Hypertree width is a measure of the degree of cyclicity of hypergraphs. A number of relevant problems from different areas, e.g., the evaluation of conjunctive queries in database theory or constraint satisfaction in AI, are tractable when their underlying hypergraphs have bounded hypertree width. However, in practical contexts like the evaluation of database queries, we have more information besides the structure of queries. For instance, we know the number of tuples in relations, the selectivity of attributes and so on. In fact, all commercial query optimizers are based on quantitative methods and do not take structural properties into account. In this paper, in order to combine structural decomposition methods with quantitative approaches, the notion of weighted hypertree decomposition is defined. Weighted hypertree decompositions are equipped with cost functions that can be used for modeling many situations where there is further information on the given problem, besides its hypergraph representation. The complexity of computing hypertree decompositions having the smallest weights, called minimal hypertree decompositions, is analyzed. It is shown that in many cases tractability is lost if weights are added. However, it is proven that, under some (not very severe) restrictions on the allowed cost functions and on the target hypertrees, optimal weighted hypertree decompositions can be computed in polynomial time. For some easier hypertree weighting functions, this problem is also highly parallelizable. Then, a cost function modeling query evaluation costs is provided, and it is shown how to exploit weighted hypertree decompositions for determining (logical) query plans for answering conjunctive queries. Finally, some preliminary results of an experimental comparison of this query optimization technique with the query optimizer of a commercial DBMS are presented.

In meta-searchers accessing distributed Web-based information repositories, performance is a major issue. Efficient query processing requires an appropriate caching mechanism. Unfortunately, standard page-based as well as tuple-based caching mechanisms designed for conventional databases are not efficient on the Web, where keyword-based querying is often the only way to retrieve data. In this work, we study the problem of semantic caching of Web queries and develop a caching mechanism for conjunctive Web queries based on signature files. Our algorithms cope with both relations of semantic containment and intersection between a query and the corresponding cache items. We also develop a cache replacement strategy to treat situations when cached items differ in size and in their contribution to partial query answers. We report results of experiments and show how the caching mechanism is realized in the Knowledge Broker system. Received June 15, 1999 / Accepted December 24, 1999

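The rapid development of large-scale domain ontologies places higher demands on data access in the Semantic Web, and basic ontology reasoning services can no longer meet the pressing need of data-intensive applications to process complex queries (mainly conjunctive queries). Consequently, much research has focused on designing and implementing conjunctive query algorithms for ontologies and description logic knowledge bases, and many practical tools for knowledge base storage and querying have been developed. Recently, research on fuzzy ontologies and fuzzy description logics, particularly on handling fuzzy information in the Semantic Web, has attracted wide attention. This paper studies conjunctive query answering over knowledge bases in the fuzzy SH family of highly expressive description logics. It proposes a corresponding tableau-based algorithm and proves that the algorithm is sound, complete, and terminating for the proper sublogics of f-SHOIQ. It further proves that the algorithm is sound for f-SHOIQ itself and analyzes why it fails to terminate there. For data complexity, it proves a tight coNP upper bound when the query contains no transitive roles. For combined complexity, it proves a co3NExpTime upper bound on the running time of the algorithm in the size of the knowledge base and the query.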
