
1.

First-order query rewriting for inconsistent databases

《Journal of Computer and System Sciences》2007,73(4):610-635

We consider the problem of retrieving consistent answers over databases that might be inconsistent with respect to a set of integrity constraints. In particular, we concentrate on sets of constraints that consist of key dependencies, and we give an algorithm that computes the consistent answers for a large and practical class of conjunctive queries. Given a query q, the algorithm returns a first-order query Q (called a query rewriting) such that for every (potentially inconsistent) database I, the consistent answers for q can be obtained by evaluating Q directly on I.
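To make the notion of a consistent answer concrete, here is a minimal brute-force sketch. It assumes the standard definition for key dependencies (a repair keeps exactly one tuple per key value, and a consistent answer is one returned on every repair), which the abstract does not spell out. The helper names `repairs` and `consistent_answers` and the sample relation are hypothetical. This is not the rewriting algorithm of the paper, which avoids enumerating repairs altogether.

```python
from itertools import product

def repairs(relation, key_len):
    """Yield every repair of `relation` when its first `key_len` attributes form the key.

    Assumption: a repair keeps exactly one tuple from each group of tuples
    that share the same key value.
    """
    groups = {}
    for t in relation:
        groups.setdefault(t[:key_len], []).append(t)
    for choice in product(*groups.values()):
        yield set(choice)

def consistent_answers(relation, key_len, query):
    """Return the answers that `query` (a function from a relation to a set) gives on every repair."""
    answer_sets = [query(rep) for rep in repairs(relation, key_len)]
    return set.intersection(*answer_sets)

# Hypothetical inconsistent relation Emp(name, dept) with key `name`:
# "alice" violates the key because she appears with two departments.
emp = {("alice", "sales"), ("alice", "hr"), ("bob", "it")}

names = consistent_answers(emp, 1, lambda r: {t[0] for t in r})
depts = consistent_answers(emp, 1, lambda r: {t[1] for t in r})
```

The number of repairs grows exponentially with the number of key violations, which is why a rewriting that can be evaluated directly on the inconsistent database is attractive.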

2.

Supporting early pruning in top-k query processing on massive data

Xixian Han Jianzhong Li 《Information Processing Letters》2011,111(11):524-532

This paper analyzes the execution behavior of “No Random Accesses” (NRA) and determines the depths to which each sorted file is scanned in the growing phase and the shrinking phase of NRA respectively. The analysis shows that NRA needs to maintain a large quantity of candidate tuples in the growing phase on massive data. Based on the analysis, this paper proposes a novel top-k algorithm, Top-K with Early Pruning (TKEP), which performs early pruning in the growing phase. A general rule and a mathematical analysis for early pruning are presented in this paper. The theoretical analysis shows that early pruning can prune most of the candidate tuples. Although TKEP is an approximate method to obtain the top-k result, the probability of correctness is extremely high. Extensive experiments show that TKEP has a significant advantage over NRA.

3.

A three-valued semantics for querying and repairing inconsistent databases

Filippo Furfaro Sergio Greco Cristian Molinaro 《Annals of Mathematics and Artificial Intelligence》2007,51(2-4):167-193

The problem of managing and querying inconsistent databases has been deeply investigated in the last few years. As the problem of consistent query answering is hard in the general case, most of the techniques proposed so far have an exponential complexity. Polynomial techniques have been proposed only for restricted forms of constraints (such as functional dependencies) and queries. In this paper, a technique for computing “approximate” consistent answers in polynomial time is proposed, which works in the presence of a wide class of constraints (namely, full constraints) and Datalog queries. The proposed approach is based on a repairing strategy where update operations assigning an undefined truth value to the “reliability” of tuples are allowed, along with updates inserting or deleting tuples. The result of a repair can be viewed as a three-valued database which satisfies the specified constraints. In this regard, a new semantics (namely, partial semantics) is introduced for constraint satisfaction in the context of three-valued databases, which aims at capturing the intuitive meaning of constraints under three-valued logic. It is shown that, in order to compute “approximate” consistent query answers, it suffices to evaluate queries by taking into account a unique repair (called deterministic repair), which in some sense “summarizes” all the possible repairs. The answers so obtained are “approximate” in the sense that they are safe (true and false atoms in the answers are, respectively, true and false under the classical two-valued semantics), but not complete.

4.

Determinacy and query rewriting for conjunctive queries and views

Foto N. Afrati 《Theoretical computer science》2011,412(11):1005-1021

Answering queries using views is the problem which examines how to derive the answers to a query when we only have the answers to a set of views. Constructing rewritings is a widely studied technique to derive those answers. In this paper we consider the problem of the existence of rewritings in the case where the answers to the views uniquely determine the answers to the query. Specifically, we say that a view set V determines a query Q if for any two databases D₁, D₂ it holds: V(D₁)=V(D₂) implies Q(D₁)=Q(D₂). We consider the case where query and views are defined by conjunctive queries and investigate the question: If a view set V determines a query Q, is there an equivalent rewriting of Q using V? We present here interesting cases where there are such rewritings in the language of conjunctive queries. Interestingly, we identify a class of conjunctive queries, CQ_path, for which a view set can produce equivalent rewritings for “almost all” queries which are determined by this view set. We introduce a problem which relates determinacy to query equivalence. We show that there are cases where restricted results can carry over to broader classes of queries.

5.

Embedding hamiltonian paths in hypercubes with a required vertex in a fixed position

Chung-Meng Lee Lih-Hsing Hsu 《Information Processing Letters》2008,107(5):171-176

Assume that n is a positive integer with $n \geq 2$. It is proved that between any two different vertices x and y of $Q_n$ there exists a path $P_l(x,y)$ of length l for any l with $h(x,y) \leq l \leq 2^n - 1$ and $2 \mid (l - h(x,y))$. We expect that such a path $P_l(x,y)$ can be further extended by including the vertices not in $P_l(x,y)$ into a hamiltonian path from x to a fixed vertex z or a hamiltonian cycle. In this paper, we prove that for any two vertices x and z from different partite sets of the n-dimensional hypercube $Q_n$, for any vertex $y \in V(Q_n) - \{x,z\}$, and for any integer l with $h(x,y) \leq l \leq 2^n - 1 - h(y,z)$ and $2 \mid (l - h(x,y))$, there exists a hamiltonian path $R(x,y,z;l)$ from x to z such that $d_{R(x,y,z;l)}(x,y) = l$. Moreover, for any two distinct vertices x and y of $Q_n$ and for any integer l with $h(x,y) \leq l \leq 2^{n-1}$ and $2 \mid (l - h(x,y))$, there exists a hamiltonian cycle $S(x,y;l)$ such that $d_{S(x,y;l)}(x,y) = l$.
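The first statement of the abstract can be checked by exhaustive search on very small hypercubes. The sketch below assumes that h(x,y) is the Hamming distance and takes the upper bound on l to be 2^n − 1, the largest number of edges a simple path in Q_n can have. Vertices are encoded as integers whose bits are the coordinates, and the function names are hypothetical. The search is exponential and only meant for n ≤ 3 or so.

```python
def hamming(x, y):
    """Hamming distance between two vertices encoded as integers."""
    return bin(x ^ y).count("1")

def path_lengths(n, x, y):
    """Return the set of lengths of all simple paths from x to y in Q_n (brute-force DFS)."""
    found = set()

    def dfs(v, visited, length):
        if v == y:
            found.add(length)
            return
        for bit in range(n):
            w = v ^ (1 << bit)          # neighbour of v: flip one coordinate
            if w not in visited:
                visited.add(w)
                dfs(w, visited, length + 1)
                visited.remove(w)

    dfs(x, {x}, 0)
    return found

def has_all_path_lengths(n):
    """Check that every pair x != y has a path of each length l with
    h(x,y) <= l <= 2^n - 1 and l - h(x,y) even."""
    size = 1 << n
    for x in range(size):
        for y in range(size):
            if x == y:
                continue
            wanted = set(range(hamming(x, y), size, 2))
            if not wanted <= path_lengths(n, x, y):
                return False
    return True
```

Calling `has_all_path_lengths(2)` and `has_all_path_lengths(3)` exercises the claim on the 4-cycle and the 3-cube; it is a sanity check of the statement, not a substitute for the proof.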

6.

Knowledge Compilation Meets Database Theory: Compiling Queries to Decision Diagrams

Abhay Jha Dan Suciu 《Theory of Computing Systems》2013,52(3):403-440

The goal of Knowledge Compilation is to represent a Boolean expression in a format in which it can answer a range of “online-queries” in PTIME. The online-query of main interest to us is model counting, because of its application to query evaluation on probabilistic databases, but other online-queries can be supported as well such as testing for equivalence, testing for implication, etc. In this paper we study the following problem: given a database query q, decide whether its lineage can be compiled efficiently into a given target language. We consider four target languages, of strictly increasing expressive power (when the size of compilation is restricted to be polynomial in the data size): read-once Boolean formulae, OBDD, FBDD and d-DNNF. For each target, we study the class of database queries that admit polynomial size representation: these queries can also be evaluated in PTIME over probabilistic databases. When queries are restricted to conjunctive queries without self-joins, it was known that these four classes collapse to the class of hierarchical queries, which is also the class of PTIME queries over probabilistic databases. Our main result in this paper is that, in the case of Unions of Conjunctive Queries (UCQ), these classes form a strict hierarchy. Thus, unlike conjunctive queries without self-joins, the expressive power of UCQ differs considerably with respect to these target compilation languages. Moreover, we give a complete characterization of the first two target languages, based on the query’s syntax.

7.

New approximations for minimum-weighted dominating sets and minimum-weighted connected dominating sets on unit disk graphs

Feng Zou Yuexuan Wang Hongwei Du 《Theoretical computer science》2011,412(3):198-208

Given a node-weighted graph, the minimum-weighted dominating set (MWDS) problem is to find a minimum-weighted vertex subset such that every vertex is either contained in this subset or has a neighbor contained in it. The minimum-weighted connected dominating set (MWCDS) problem is to find an MWDS such that the graph induced by this subset is connected. In this paper, we study these two problems on a unit disk graph. A (4+ε)-approximation algorithm for an MWDS based on a dynamic programming algorithm for a Min-Weight Chromatic Disk Cover is presented. Meanwhile, we also propose a (1+ε)-approximation algorithm for the connecting part by showing a polynomial-time approximation scheme for a Node-Weighted Steiner Tree problem when the given terminal set is c-local, and thus obtain a (5+ε)-approximation algorithm for an MWCDS.

8.

Scalable keyword search on large data streams

Lu Qin Jeffrey Xu Yu Lijun Chang 《The VLDB Journal The International Journal on Very Large Data Bases》2011,20(1):35-57

It is widely recognized that the integration of information retrieval (IR) and database (DB) techniques provides users with a broad range of high quality services. Along this direction, IR-styled m-keyword query processing over a relational database in an RDBMS framework has been well studied. It finds all hidden interconnected tuple structures, for example connected trees that contain keywords and are interconnected by sequences of primary/foreign key relationships among tuples. A new challenging issue is how to monitor events that are implicitly interrelated over an open-ended relational data stream for a user-given m-keyword query. Such a relational data stream is a sequence of tuple insertion/deletion operations. The difficulty of the problem is related to the number of costly joins to be processed over time when tuples are inserted and/or deleted. Such cost is mainly affected by three parameters, namely, the number of keywords, the maximum size of interconnected tuple structures, and the complexity of the database schema when it is viewed as a schema graph. In this paper, we propose new approaches. First, we propose a novel algorithm to efficiently determine all the joins that need to be processed for answering an m-keyword query. Second, we propose a new demand-driven approach to process such a query over a high speed relational data stream. We show that we can achieve high efficiency by significantly reducing the number of intermediate results when processing joins over a relational data stream. The proposed new techniques allow us to achieve high scalability in terms of both query plan generation and query plan execution. We conducted extensive experimental studies using synthetic data and real data to simulate a relational data stream. Our approach significantly outperforms existing algorithms.

9.

Semantic-distance based evaluation of ranking queries over relational databases (total citations: 1; self-citations: 0; citations by others: 1)

Liang Zhu Qin Ma Chunnian Liu Guojun Mao Wenzhu Yang 《Journal of Intelligent Information Systems》2010,35(3):415-445

Traditional database search uses pattern matching in the comparison process. For a query with some search words, tuples are selected only if the words of the tuples exactly match the query words. In this paper, we propose a new method for evaluating relational ranking queries (or top-N queries) with text attributes. This method defines semantic distance functions and utilizes semantic matching between words in database search. The aim is to fetch not only tuples that exactly match the query but also tuples that are close to it according to semantic distances. The basic idea of the method is to create an index based on WordNet to expand the tuple words semantically. The candidate results for a query are retrieved by the index and a simple SQL selection statement, and then top-N answers are obtained. Extensive experiments are carried out to measure the performance of this new strategy for the evaluation of ranking queries over relational databases.

10.

Testing bag-containment of conjunctive queries

Nieves R. Brisaboa Héctor J. Hernández 《Acta Informatica》1997,34(7):557-578

Under the bag-theoretic semantics relations are bags of tuples, that is, a tuple may have any number of duplicates. Under this semantics, a conjunctive query $Q_1$ is bag-contained in a conjunctive query $Q_2$, denoted $Q_1 \leq_b Q_2$, if for all databases $D$, $Q_1(D)$, the result of applying $Q_1$ to $D$, is a subbag of $Q_2(D)$. It is not known whether testing $Q_1 \leq_b Q_2$ is decidable. In this paper we prove that $Q_1 \leq_b Q_2$ can be tested on a finite set of canonical databases built from the body of $Q_1$. Using that result we give a procedure that decides the bag-containment problem of conjunctive queries in a large number of cases. Received: 27 September 1995 / 19 June 1996
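The subbag condition can be illustrated on a single database with a small evaluator. This is a minimal sketch under assumptions: a database maps relation names to `collections.Counter` objects (tuple to multiplicity), a conjunctive query is a pair of head variables and body atoms, and the names `eval_bag` and `is_subbag` are hypothetical. Checking one database can refute bag-containment but cannot establish it; it is not the decision procedure of the paper.

```python
from collections import Counter
from itertools import product

def eval_bag(head, body, db):
    """Evaluate a conjunctive query under bag semantics.

    head: tuple of variable names; body: list of (relation_name, variables).
    Each consistent choice of one stored tuple per atom contributes the
    product of the chosen tuples' multiplicities to its head tuple.
    """
    out = Counter()
    for choice in product(*(db[rel].items() for rel, _ in body)):
        binding, mult, ok = {}, 1, True
        for (_, variables), (tup, m) in zip(body, choice):
            mult *= m
            for var, val in zip(variables, tup):
                if binding.setdefault(var, val) != val:
                    ok = False      # same variable bound to two different values
        if ok:
            out[tuple(binding[v] for v in head)] += mult
    return out

def is_subbag(a, b):
    """True if every tuple occurs in b at least as often as in a."""
    return all(b[t] >= n for t, n in a.items())

# Hypothetical database: R contains (a,b) once and (a,c) twice.
db = {"R": Counter({("a", "b"): 1, ("a", "c"): 2})}
q1 = (("x",), [("R", ("x", "y"))])                       # Q1(x) :- R(x,y)
q2 = (("x",), [("R", ("x", "y")), ("R", ("x", "z"))])    # Q2(x) :- R(x,y), R(x,z)

r1 = eval_bag(*q1, db)
r2 = eval_bag(*q2, db)
```

On this database Q1 returns ("a",) three times and Q2 returns it nine times, so the result of Q1 is a subbag of the result of Q2 but not the other way round, although the two queries are equivalent under set semantics.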

11.

On the finite controllability of conjunctive query answering in databases under open-world assumption

Riccardo Rosati 《Journal of Computer and System Sciences》2011,77(3):572-594

In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries and unions of conjunctive queries. We present results about the decidability of OWA query answering under ICs. In particular, we study OWA query answering both over finite databases and over unrestricted databases, and identify the cases in which such a problem is finitely controllable, i.e., when OWA query answering over finite databases coincides with OWA query answering over unrestricted databases. Moreover, we are able to easily turn the above results into new results about implication of ICs and query containment under ICs, due to the deep relationship between OWA query answering and these two classical problems in database theory. In particular, we close two long-standing open problems in query containment, since we prove finite controllability of containment of conjunctive queries both under arbitrary inclusion dependencies and under key and foreign key dependencies. The results of our investigation are very relevant in many research areas which have recently dealt with databases under an incomplete information assumption: e.g., data integration, data exchange, view-based information access, ontology-based information systems, and peer data management systems.

12.

Using a relational database for scalable XML search

Rebecca J. Cathey Steven M. Beitzel Eric C. Jensen David Grossman Ophir Frieder 《The Journal of supercomputing》2008,44(2):146-178

XML is a flexible and powerful tool that enables information and security sharing in heterogeneous environments. Scalable technologies are needed to effectively manage the growing volumes of XML data. A wide variety of methods exist for storing and searching XML data; the two most common techniques are conventional tree-based and relational approaches. Tree-based approaches represent XML as a tree and use indexes and path join algorithms to process queries. In contrast, the relational approach utilizes the power of a mature relational database to store and search XML. This method relationally maps XML queries to SQL and reconstructs the XML from the database results. To date, the limited acceptance of the relational approach to XML processing is due to the need to redesign the relational schema each time a new XML hierarchy is defined. We, in contrast, describe a relational approach that uses a fixed schema, eliminating the need for schema redesign at the expense of potentially longer runtimes. We show, however, that these potentially longer runtimes are still significantly shorter than those of the tree approach. We use a popular XML benchmark to compare the scalability of both approaches. We generated large collections of heterogeneous XML documents ranging in size from 500 MB to 8 GB using the XBench benchmark. The scalability of each method was measured by running XML queries that cover a wide range of XML search features on each collection. We measure the scalability of each method over different query features as the collection size increases. In addition, we examine the performance of each method as the result size and the number of predicates increase. Our results show that our relational approach provides a scalable approach to XML retrieval by leveraging existing relational database optimizations. Furthermore, we show that the relational approach typically outperforms the tree-based approach while scaling consistently over all collections studied.

Ophir Frieder (corresponding author)


13.

Nonlinear mappings in problem solving and their PSO-based development

Adam Pedrycz Fangyan Dong 《Information Sciences》2011,181(19):4112-4123

The study is devoted to a concept and algorithmic realization of nonlinear mappings aimed at increasing the effectiveness of the problem solving method. Given the original input space X and a certain problem solving method M, a nonlinear mapping φ is designed so that the method operating in the transformed space, M(φ(X)), becomes more efficient. The nonlinear mappings realize a transformation of X through contractions and expansions of selected regions of the original space. In particular, we show how a piecewise linear mapping is optimized by using particle swarm optimization (PSO) and a suitable fitness function quantifying the objective of the problem. Several families of problems are investigated and illustrated through experimental results.

14.

Probabilistic query answering over inconsistent databases

Sergio Greco Cristian Molinaro 《Annals of Mathematics and Artificial Intelligence》2012,64(2-3):185-207

This paper presents a framework for querying inconsistent databases in the presence of functional dependencies. Most of the works dealing with the problem of extracting reliable information from inconsistent databases are based on the notion of repair, a minimal set of tuple insertions and deletions which leads the database to a consistent state (called repaired database), and the notion of consistent query answer, a query answer that can be obtained from every repaired database. In this work, both the notion of repair and query answer differ from the original ones. In the presence of functional dependencies, tuple deletions are the only operations that are performed in order to restore the consistency of an inconsistent database. However, deleting a tuple to remove an integrity violation potentially eliminates useful information in that tuple. In order to cope with this problem, we adopt a notion of repair, based on tuple updates, which allows us to better preserve information in the source database. A drawback of the notion of consistent query answer is that it does not allow us to discriminate among non-consistent answers, namely answers which can be obtained from a non-empty proper subset of the repaired databases. To obtain more informative query answers, we propose the notion of probabilistic query answer, that is, query answers are tuples associated with probabilities. This new semantics of query answering over inconsistent databases allows us to give a measure of uncertainty to query answers. We show that the problem of computing probabilistic query answers is $FP^{\#P}$-complete. We also propose a technique for computing probabilistic answers to arbitrary relational algebra queries.

15.

The power of inequality semijoins

Philip A. Bernstein Nathan Goodman 《Information Systems》1981,6(4):255-265

Semijoin is a relational operator used in many relational query processing algorithms. Semijoins can be used to “reduce” the database by delimiting portions of the database that contain data relevant to a given query. For some queries, there exist sequences of semijoins that delimit the exact portions of the database needed to answer the query. Such sequences are called full reducers.

This paper considers a class of queries called natural inequality queries (NI queries), and characterizes a subclass for which full reducers exist. We also present an efficient algorithm that decides whether an NI query lies within this subclass, and constructs a full reducer for the query. The NI queries are a subset of the aggregate-free, conjunctive queries of QUEL, and permit join clauses to include <, ≤, =, ≥, >.
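A single semijoin step with an inequality join clause can be sketched as follows. The representation (relations as lists of dictionaries), the function name `semijoin`, and the sample data are assumptions made for illustration. The sketch shows one reduction step only, not the construction of a full reducer described in the abstract.

```python
import operator

def semijoin(r, s, r_attr, s_attr, theta=operator.eq):
    """Return the tuples of r that join with at least one tuple of s
    under the clause r[r_attr] theta s[s_attr]."""
    return [t for t in r if any(theta(t[r_attr], u[s_attr]) for u in s)]

# Hypothetical relations: keep only the orders whose quantity is below some limit.
orders = [{"id": 1, "qty": 5}, {"id": 2, "qty": 50}]
limits = [{"max": 10}, {"max": 20}]

reduced = semijoin(orders, limits, "qty", "max", operator.lt)
```

Here `reduced` retains only order 1, because 5 is below at least one limit while 50 is below none. Only attributes of `orders` appear in the result, which is what distinguishes a semijoin from a join.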

16.

Prediction-hardness of acyclic conjunctive queries

《Theoretical computer science》2005,348(1):84-94


17.

Semantic optimization techniques for preference queries

Jan Chomicki 《Information Systems》2007

Preference queries are relational algebra or SQL queries that contain occurrences of the winnow operator (find the most preferred tuples in a given relation). Such queries are parameterized by specific preference relations. Semantic optimization techniques make use of integrity constraints holding in the database. In the context of semantic optimization of preference queries, we identify two fundamental properties: containment of preference relations relative to integrity constraints and satisfaction of order axioms relative to integrity constraints. We show numerous applications of those notions to preference query evaluation and optimization. As integrity constraints, we consider constraint-generating dependencies, a class generalizing functional dependencies. We demonstrate that the problems of containment and satisfaction of order axioms can be captured as specific instances of constraint-generating dependency entailment. This makes it possible to formulate necessary and sufficient conditions for the applicability of our techniques as constraint validity problems. We characterize the computational complexity of such problems.

18.

Weighted hypertree decompositions and optimal query plans

《Journal of Computer and System Sciences》2007,73(3):475-506

Hypertree width is a measure of the degree of cyclicity of hypergraphs. A number of relevant problems from different areas, e.g., the evaluation of conjunctive queries in database theory or constraint satisfaction in AI, are tractable when their underlying hypergraphs have bounded hypertree width. However, in practical contexts like the evaluation of database queries, we have more information besides the structure of queries. For instance, we know the number of tuples in relations, the selectivity of attributes and so on. In fact, all commercial query optimizers are based on quantitative methods and disregard structural properties. In this paper, in order to combine structural decomposition methods with quantitative approaches, the notion of weighted hypertree decomposition is defined. Weighted hypertree decompositions are equipped with cost functions that can be used for modeling many situations where there is further information on the given problem, besides its hypergraph representation. The complexity of computing hypertree decompositions having the smallest weights, called minimal hypertree decompositions, is analyzed. It is shown that in many cases tractability is lost if weights are added. However, it is proven that, under some not very severe restrictions on the allowed cost functions and on the target hypertrees, optimal weighted hypertree decompositions can be computed in polynomial time. For some easier hypertree weighting functions, this problem is also highly parallelizable. Then, a cost function modeling query evaluation costs is provided, and it is shown how to exploit weighted hypertree decompositions for determining (logical) query plans for answering conjunctive queries. Finally, some preliminary results of an experimental comparison of this query optimization technique with the query optimizer of a commercial DBMS are presented.

19.

Restricted compositions and permutations: From old to new Gray codes

V. Vajnovszki R. Vernay 《Information Processing Letters》2011,111(13):650-655

Any Gray code for a set of combinatorial objects defines a total order relation on this set: x is less than y if and only if y occurs after x in the Gray code list. Let ≺ denote the order relation induced by the classical Gray code for the product set (the natural extension of the Binary Reflected Gray Code to k-ary tuples). The restriction of ≺ to the set of compositions and bounded compositions gives known Gray codes for those sets. Here we show that ≺ restricted to the set of bounded compositions of an interval still yields a Gray code. An n-composition of an interval is an n-tuple of integers whose sum lies between two integers; the set of bounded n-compositions of an interval simultaneously generalizes the product set and the compositions of an integer, and so ≺ puts all these Gray codes under a single roof. As a byproduct we obtain Gray codes for permutations with a number of inversions lying between two integers, and with an even/odd number of inversions or cycles. Such particular classes of permutations are used to solve some computationally difficult problems.
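The reflected Gray code for a product set, and the order relation it induces, can be sketched in a few lines. The construction below is the usual reflection scheme (the sublist for the remaining coordinates is traversed forwards for even leading values and backwards for odd ones); the function names and the interval-filtering helper are hypothetical, and the sketch makes no claim about how close consecutive tuples are after restriction, which is the subject of the paper.

```python
def reflected_gray(bounds):
    """List all tuples t with 0 <= t[i] < bounds[i] in reflected Gray code order.

    Consecutive tuples differ in exactly one coordinate, by +1 or -1.
    """
    if not bounds:
        return [()]
    rest = reflected_gray(bounds[1:])
    out = []
    for v in range(bounds[0]):
        block = rest if v % 2 == 0 else rest[::-1]   # reflect on odd values
        out.extend((v,) + t for t in block)
    return out

def precedes(code, x, y):
    """Order relation induced by the list: x is less than y iff y occurs after x."""
    return code.index(x) < code.index(y)

def restrict_to_interval(code, low, high):
    """Keep, in the induced order, the tuples whose sum lies in [low, high]."""
    return [t for t in code if low <= sum(t) <= high]

binary = reflected_gray((2, 2))     # the Binary Reflected Gray Code on 2 bits
mixed = reflected_gray((3, 4, 2))   # a mixed-radix product set
```

With bounds `(2, 2)` the list is the familiar 00, 01, 11, 10. Calling `restrict_to_interval(mixed, 2, 3)` lists the bounded compositions of the interval [2, 3] in the order induced by the product-set code.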

20.

Mining frequent conjunctive queries in relational databases through dependency discovery

Bart Goethals Dominique Laurent Wim Le Page Cheikh Tidiane Dieng 《Knowledge and Information Systems》2012,33(3):655-684

We present an approach for mining frequent conjunctive queries in arbitrary relational databases. Our pattern class is the simple but appealing subclass of simple conjunctive queries. Our algorithm, called Conqueror$^+$, is capable of detecting previously unknown functional and inclusion dependencies that hold on the database relations as well as on joins of relations. These newly detected dependencies are then used to prune redundant queries. We propose an efficient database-oriented implementation of our algorithm using SQL and provide several promising experimental results.