Similar literature
Found 20 similar documents; search took 593 ms
1.
In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under the open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries and unions of conjunctive queries. We present results about the decidability of OWA query answering under ICs. In particular, we study OWA query answering both over finite databases and over unrestricted databases, and identify the cases in which the problem is finitely controllable, i.e., when OWA query answering over finite databases coincides with OWA query answering over unrestricted databases. Moreover, we turn the above results into new results about implication of ICs and query containment under ICs, owing to the deep relationship between OWA query answering and these two classical problems in database theory. In particular, we close two long-standing open problems in query containment, since we prove finite controllability of containment of conjunctive queries both under arbitrary inclusion dependencies and under key and foreign key dependencies. The results of our investigation are relevant to many research areas that have recently dealt with databases under an incomplete information assumption: e.g., data integration, data exchange, view-based information access, ontology-based information systems, and peer data management systems.

2.
3.
4.
5.
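The rapid development of large-scale domain ontologies places higher demands on data access in the Semantic Web, and basic ontology reasoning services can no longer meet the pressing need of data-intensive applications to process complex queries (chiefly conjunctive queries). Consequently, much research has focused on designing and implementing conjunctive query algorithms for ontologies and description logic knowledge bases, and many practical tools for knowledge base storage and querying have been developed. Recently, research on fuzzy ontologies and fuzzy description logics, particularly their use in handling fuzzy information in the Semantic Web, has attracted wide attention. This paper studies conjunctive query answering over knowledge bases in the fuzzy SH family of highly expressive description logics. It proposes a corresponding tableau-based algorithm and proves that the algorithm is sound, complete, and terminating for proper sublogics of f-SHOIQ. It also proves that the algorithm is sound for f-SHOIQ and analyzes the causes of its non-termination. For data complexity, a tight coNP upper bound is proved for queries that contain no transitive roles. For combined complexity, a co3NExpTime upper bound in the size of the knowledge base and the query is proved.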

6.
As a result of the extensive research in view-based query processing, three notions have been identified as fundamental, namely rewriting, answering, and losslessness. Answering amounts to computing the tuples satisfying the query in all databases consistent with the views. Rewriting consists in first reformulating the query in terms of the views and then evaluating the rewriting over the view extensions. Losslessness holds if we can answer the query by solely relying on the content of the views. While the mutual relationship between these three notions is easy to identify in the case of conjunctive queries, the terrain of notions gets considerably more complicated going beyond such a query class. In this paper, we revisit the notions of answering, rewriting, and losslessness and clarify their relationship in the setting of semistructured databases, and in particular for the basic query class in this setting, i.e., two-way regular path queries. Our first result is a clean explanation of the relationship between answering and rewriting, in which we characterize rewriting as a “linear approximation” of query answering. We show that applying this linear approximation to the constraint-satisfaction framework yields an elegant automata-theoretic approach to query rewriting. As for losslessness, we show that there are indeed two distinct interpretations for this notion, namely with respect to answering, and with respect to rewriting. We also show that the constraint-theoretic approach and the automata-theoretic approach can be combined to give an algorithmic characterization of the various facets of losslessness. Finally, we deal with the problem of coping with loss, by considering mechanisms aimed at explaining lossiness to the user.

7.
We consider the problem of retrieving consistent answers over databases that might be inconsistent with respect to a set of integrity constraints. In particular, we concentrate on sets of constraints that consist of key dependencies, and we give an algorithm that computes the consistent answers for a large and practical class of conjunctive queries. Given a query q, the algorithm returns a first-order query Q (called a query rewriting) such that for every (potentially inconsistent) database I, the consistent answers for q can be obtained by evaluating Q directly on I.
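
To make the notion of a consistent answer concrete, the following is a minimal brute-force sketch, not the rewriting algorithm the abstract describes. It assumes the usual repair semantics for key dependencies: a repair keeps exactly one tuple per key value, and an answer is consistent if the query returns it on every repair. The relation `emp`, its key `by_name`, and the two sample queries are hypothetical names chosen for illustration.

```python
from itertools import product


def repairs(rows, key):
    """Yield every repair of `rows`: exactly one row kept per key value."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    for choice in product(*groups.values()):
        yield list(choice)


def consistent_answers(rows, key, query):
    """Return the answers that `query` produces on every repair."""
    result = None
    for repair in repairs(rows, key):
        answers = set(query(repair))
        result = answers if result is None else result & answers
    return result if result is not None else set()


# Hypothetical relation Emp(name, dept) whose key is `name`.
# The two "ann" rows violate the key dependency.
emp = [("ann", "sales"), ("ann", "hr"), ("bob", "it")]


def by_name(row):
    return row[0]


def q_rows(rel):
    """Query: all (name, dept) pairs."""
    return {(name, dept) for name, dept in rel}


def q_names(rel):
    """Query: all employee names."""
    return {name for name, _ in rel}


print(consistent_answers(emp, by_name, q_rows))
print(consistent_answers(emp, by_name, q_names))
```

Enumerating repairs takes time exponential in the number of conflicting key groups. A first-order rewriting Q avoids this cost by being evaluated once, directly on the inconsistent database.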

8.
The system Queer is a prototype of an information system design tool which directly supports an extended Entity-Relationship model on its front-end and uses a semantically well-founded query and manipulation language based on an Entity-Relationship calculus. The system basically consists of a set of compilers written in Prolog which translate data specifications, schema definitions, queries, integrity constraints and data-manipulation statements into Prolog programs. All features mentioned are implemented in form and extent as described here.

9.
We address the issue of incorporating a particular yet expressive form of integrity constraints (namely, denial constraints) into probabilistic databases. To this aim, we move away from the common way of giving semantics to probabilistic databases, which relies on considering a unique interpretation of the data, and address two fundamental problems: consistency checking and query evaluation. The former consists in verifying whether there is an interpretation which conforms to both the marginal probabilities of the tuples and the integrity constraints. The latter is the problem of answering queries under a “cautious” paradigm, taking into account all interpretations of the data in accordance with the constraints. In this setting, we investigate the complexity of the above-mentioned problems, and identify several tractable cases of practical relevance.

10.
Semistructured data occur in situations where information lacks a homogeneous structure and is incomplete. Yet, up to now the incompleteness of information has not been reflected by special features of query languages. Our goal is to investigate the principles of queries that allow for incomplete answers. We do not present, however, a concrete query language. Queries over classical structured data models contain a number of variables and constraints on these variables. An answer is a binding of the variables by elements of the database such that the constraints are satisfied. In the present paper, we loosen this concept in so far as we allow also answers that are partial; that is, not all variables in the query are bound by such an answer. Partial answers make it necessary to refine the model of query evaluation. The first modification relates to the satisfaction of constraints: in some circumstances we consider constraints involving unbound variables as satisfied. Second, in order to prevent a proliferation of answers, we only accept answers that are maximal in the sense that there are no assignments that bind more variables and satisfy the constraints of the query. Our model of query evaluation consists of two phases, a search phase and a filter phase. Semistructured databases are essentially labeled directed graphs. In the search phase, we use a query graph containing variables to match a maximal portion of the database graph. We investigate three different semantics for query graphs, which give rise to three variants of matching. For each variant, we provide algorithms and complexity results. In the filter phase, the maximal matchings resulting from the search phase are subjected to constraints, which may be weak or strong. Strong constraints require all their variables to be bound, while weak constraints do not. 
We describe a polynomial algorithm for evaluating a special type of queries with filter constraints, and assess the complexity of evaluating other queries for several kinds of constraints. In the final part, we investigate the containment problem for queries consisting only of search constraints under the different semantics.

11.
MiniCon: A scalable algorithm for answering queries using views   Cited by: 5 (self-citations: 0, citations by others: 5)
The problem of answering queries using views is to find efficient methods of answering a query using a set of previously materialized views over the database, rather than accessing the database relations. The problem has received significant attention because of its relevance to a wide variety of data management problems, such as data integration, query optimization, and the maintenance of physical data independence. To date, the performance of proposed algorithms has received very little attention, and in particular, how they scale in the presence of a large number of views is unknown. We first analyze two previous algorithms, the bucket algorithm and the inverse-rules algorithm, and show their deficiencies. We then describe MiniCon, a novel algorithm for finding the maximally-contained rewriting of a conjunctive query using a set of conjunctive views. We present the first experimental study of algorithms for answering queries using views. The study shows that MiniCon scales up well and significantly outperforms the previous algorithms. We describe an extension of MiniCon to handle comparison predicates, and show its performance experimentally. Finally, we describe how MiniCon can be extended to the context of query optimization. Received: 15 October 2000 / Accepted: 15 April 2001 / Published online: 28 June 2001

12.
13.
This work studies the quantum query complexity of Boolean functions in an unbounded-error scenario where it is only required that the query algorithm succeeds with a probability strictly greater than 1/2. We show that, just as in the communication complexity model, the unbounded-error quantum query complexity is exactly half of its classical counterpart for any (partial or total) Boolean function. Moreover, connecting the query and communication complexity results, we show that the “black-box” approach to convert quantum query algorithms into communication protocols by Buhrman-Cleve-Wigderson [STOC’98] is optimal even in the unbounded-error setting. We also study a related setting, called the weakly unbounded-error setting, where the cost of a query algorithm is given by q + log(1/2(p − 1/2)), where q is the number of queries made and p > 1/2 is the success probability of the algorithm. In contrast to the case of communication complexity, we show a tight multiplicative Θ(log n) separation between quantum and classical query complexity in this setting for a partial Boolean function. The asymptotic equivalence between them is also shown for some well-studied total Boolean functions.

14.
To populate a data warehouse specifically designed for Web data, i.e., a web warehouse, it is imperative to harness relevant documents from the Web. In this paper, we describe a query mechanism called coupling query to glean relevant Web data in the context of our web warehousing system called Warehouse Of Web Data (WHOWEDA). A coupling query may be used for querying both HTML and XML documents. Some of the important features of our query mechanism are the ability to query metadata, content, and the internal and external (hyperlink) structure of Web documents based on partial knowledge; the ability to express constraints on tag attributes and tagless segments of data; the ability to express conjunctive as well as disjunctive query conditions compactly; the ability to control execution of a web query; and preservation of the topological structure of hyperlinked documents in the query results. We also discuss how to formulate a query graphically and in textual form using a coupling graph and coupling text, respectively.

15.
Hypertree width is a measure of the degree of cyclicity of hypergraphs. A number of relevant problems from different areas, e.g., the evaluation of conjunctive queries in database theory or constraint satisfaction in AI, are tractable when their underlying hypergraphs have bounded hypertree width. However, in practical contexts like the evaluation of database queries, we have more information besides the structure of queries. For instance, we know the number of tuples in relations, the selectivity of attributes and so on. In fact, all commercial query optimizers are based on quantitative methods and disregard structural properties. In this paper, in order to combine structural decomposition methods with quantitative approaches, the notion of weighted hypertree decomposition is defined. Weighted hypertree decompositions are equipped with cost functions that can be used for modeling many situations where there is further information on the given problem, besides its hypergraph representation. The complexity of computing hypertree decompositions having the smallest weights, called minimal hypertree decompositions, is analyzed. It is shown that in many cases tractability is lost if weights are added. However, it is proven that, under some (not very severe) restrictions on the allowed cost functions and on the target hypertrees, optimal weighted hypertree decompositions can be computed in polynomial time. For some easier hypertree weighting functions, this problem is also highly parallelizable. Then, a cost function modeling query evaluation costs is provided, and it is shown how to exploit weighted hypertree decompositions for determining (logical) query plans for answering conjunctive queries. Finally, some preliminary results of an experimental comparison of this query optimization technique with the query optimizer of a commercial DBMS are presented.

16.
Neighborhood and associative query answering   Cited by: 5 (self-citations: 0, citations by others: 5)
Cooperative query answering extends the classical notion of query answering to provide neighborhood and associated information. Neighborhood query answering relaxes the query and its answer via abstract representations. To integrate the abstraction view with the subsumption (is-a) and composition (part-of) views of type hierarchy, the notion of type abstraction hierarchy is introduced. To evaluate and control query relaxation, a nearness measure mechanism is provided. Associative query answering provides information conceptually related to, but not explicitly asked for by, the query. As object association is context sensitive, a DB-Pattern-KB framework is developed that couples domain-specific knowledge and participating objects in localized problem domains via virtual database patterns. Associative query answering can then be accomplished through tracing the behavior dependencies among cooperating objects in those problem domains. Such a framework allows related databases and knowledge bases to be linked dynamically in various contexts yet be maintained relatively independent of each other. The proposed approach has been implemented in the cooperative database system testbed, CoBase, at UCLA. Our experience reveals that the proposed techniques are effective for cooperative query answering. This research is supported by DARPA contract N00174-91-C-0107.

17.
18.
We consider the following problem: Given an unsorted array of n elements, and a sequence of intervals in the array, compute the median in each of the subarrays defined by the intervals. We describe a simple algorithm which needs O(n log k + k log n) time to answer k such median queries. This improves previous algorithms by a logarithmic factor and matches a comparison lower bound for k = O(n). The space complexity of our simple algorithm is O(n log n) in the pointer machine model, and O(n) in the RAM model. In the latter model, a more involved O(n) space data structure can be constructed in O(n log n) time where the time per query is reduced to O(log n / log log n). We also give efficient dynamic variants of both data structures, achieving O(log^2 n) query time using O(n log n) space in the comparison model and O((log n / log log n)^2) query time using O(n log n / log log n) space in the RAM model, and show that in the cell-probe model, any data structure which supports updates in O(log^{O(1)} n) time must have Ω(log n / log log n) query time. Our approach naturally generalizes to higher-dimensional range median problems, where element positions and query ranges are multidimensional: it reduces a range median query to a logarithmic number of range counting queries.
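
The reduction in the last sentence, from one range median query to a logarithmic number of range counting queries, can be sketched as follows. This is a simple illustration, not the data structure from the abstract, and it does not achieve the stated bounds. It assumes a merge sort tree for counting and a binary search over the globally sorted values; `build_tree`, `count_le`, and `range_median` are names invented for this sketch.

```python
from bisect import bisect_right


def build_tree(values):
    """Merge sort tree: each node stores its segment's values in sorted order."""
    size = 1
    while size < len(values):
        size *= 2
    tree = [[] for _ in range(2 * size)]
    for i, v in enumerate(values):
        tree[size + i] = [v]
    for node in range(size - 1, 0, -1):
        tree[node] = sorted(tree[2 * node] + tree[2 * node + 1])
    return tree, size


def count_le(tree, size, lo, hi, x):
    """Range counting: how many values at positions lo..hi-1 are <= x."""
    count = 0
    lo += size
    hi += size
    while lo < hi:
        if lo & 1:
            count += bisect_right(tree[lo], x)
            lo += 1
        if hi & 1:
            hi -= 1
            count += bisect_right(tree[hi], x)
        lo //= 2
        hi //= 2
    return count


def range_median(tree, size, lo, hi):
    """Lower median of positions lo..hi-1 (non-empty), via binary search on rank."""
    ordered = tree[1]  # the root holds every value in sorted order
    target = (hi - lo + 1) // 2  # 1-based rank of the lower median
    left, right = 0, len(ordered) - 1
    while left < right:
        mid = (left + right) // 2
        # One range counting query per binary-search step.
        if count_le(tree, size, lo, hi, ordered[mid]) >= target:
            right = mid
        else:
            left = mid + 1
    return ordered[left]


data = [5, 1, 4, 2, 3, 9, 7]
tree, size = build_tree(data)
print(range_median(tree, size, 1, 4))  # median of [1, 4, 2]; prints 2
```

Each median query performs O(log n) counting queries, and each counting query in this sketch costs O(log^2 n), so the total is slower than the bounds quoted in the abstract. The sketch is meant only to show the shape of the reduction.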

19.
20.
Finding typical instances is an effective approach to understanding and analyzing large data sets. In this paper, we apply the idea of typicality analysis from psychology and cognitive science to database query answering, and study the novel problem of answering top-k typicality queries. We model typicality in large data sets systematically. Three types of top-k typicality queries are formulated. To answer questions like “Who are the top-k most typical NBA players?”, the measure of simple typicality is developed. To answer questions like “Who are the top-k most typical guards distinguishing guards from other players?”, the notion of discriminative typicality is proposed. Moreover, to answer questions like “Who are the best k typical guards in whole representing different types of guards?”, the notion of representative typicality is used. Computing the exact answer to a top-k typicality query requires quadratic time, which is often too costly for online query answering on large databases. We develop a series of approximation methods for various situations: (1) the randomized tournament algorithm has linear complexity though it does not provide a theoretical guarantee on the quality of the answers; (2) the direct local typicality approximation using VP-trees provides an approximation quality guarantee; (3) a local typicality tree data structure can be exploited to index a large set of objects. Then, typicality queries can be answered efficiently with quality guarantees by a tournament method based on a Local Typicality Tree. An extensive performance study using two real data sets and a series of synthetic data sets clearly shows that top-k typicality queries are meaningful and our methods are practical.
