期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

View selection for <Emphasis Type="Italic">real</Emphasis> conjunctive queries

Foto Afrati Rada Chirkova Manolis Gergatsoulis Vassia Pavlaki 《Acta Informatica》2007,44(5):289-321

Given a query workload, a database and a set of constraints, the view-selection problem is to select views to materialize so that the constraints are satisfied and the views can be used to compute the queries in the workload efficiently. A typical constraint, which we consider in the present work, is to require that the views can be stored in a given amount of disk space. Depending on features of SQL queries (e.g., the DISTINCT keyword) and on whether the database relations on which the queries are applied are sets or bags, the queries may be computed under set semantics, bag-set semantics, or bag semantics. In this paper we study the complexity of the view-selection problem for conjunctive queries and views under these semantics. We show that bag semantics is the “easiest to handle” (we show that in this case the decision version of view selection is in NP), whereas under set and bag-set semantics we assume further restrictions on the query workload (we only allow queries without self-joins in the workload) to achieve the same complexity. Moreover, while under bag and bag-set semantics filtering views (i.e., subgoals that can be dropped from the rewriting without impacting equivalence to the query) are practically not needed, under set semantics filtering views can reduce significantly the query-evaluation costs. We show that under set semantics the decision version of the view-selection problem remains in NP only if filtering views are not allowed in the rewritings. Finally, we investigate whether the cgalg algorithm for view selection introduced in Chirkova and Genesereth (Linearly bounded reformulations of conjunctive databases, pp. 987–1001, 2000) is suitable in our setting. We prove that this algorithm is sound for all cases we examine here, and that it is complete under bag semantics for workloads of arbitrary conjunctive queries and under bag-set semantics for workloads of conjunctive queries without self-joins. Rada Chirkova’s work on this material has been supported by the National Science Foundation under Grant No. 0307072. The project is co-funded by the European Social Fund (75%) and National Resources (25%)- Operational Program for Educational and Vocational Training II (EPEAEK II) and particularly the program PYTHAGORAS. A preliminary version of this paper appears in F. Afrati, R. Chirkova, M. Gergatsoulis, V. Pavlaki. Designing Views to Efficiently Answer Real SQL Queries. In Proc. of SARA 2005, LNAI Vol. 3607, pages 332-346, Springer-Verlag, 2005. 相似文献

2.

Query containment under bag and bag-set semantics

Foto N. Afrati Manolis Gergatsoulis 《Information Processing Letters》2010,110(10):360-369

Conjunctive queries (CQs) are at the core of query languages encountered in many logic-based research fields such as AI, or database systems. The majority of existing work assumes set semantics but often in real applications the manipulation of duplicate tuples is required. One of the major problems that arises as part of advanced features of query optimization, data integration, query reformulation and many other research topics is testing for containment of such queries. In this work, we investigate the complexity of query containment problem for CQs under bag semantics (i.e. duplicate tuples are allowed in both the database and the results of queries) and under bag-set semantics (i.e. duplicates are allowed in the result of the queries but not in the database). We derive complexity results for these problems for five major subclasses of CQs; and we also find necessary conditions for CQ query containment. The general case of these problems remains open. 相似文献

3.

On solving efficiently the view selection problem under bag and bag-set semantics

《Information Systems》2014

In this paper, we investigate the problem of view selection for workloads of conjunctive queries under bag and bag-set semantics. In particular, for both semantics we aim to limit the search space of candidate viewsets. We also start delineating boundaries between query workloads for which certain even more restricted search spaces suffice. They suffice in the sense that they do not compromise optimality in that they contain at least one of the optimal solutions. We start with the general case for both bag and bag-set semantics, where we give a tight condition that candidate views can satisfy and still the search space (thus limited) does contain at least one optimal solution. We show that these results, for both semantics, reduce the size of the search space significantly. Further on, due to this analysis for both semantics, a delineation of the space of viewsets and the space of the corresponding equivalent rewritings for a certain conjunctive query workload is given. We show that for chain query workloads under both bag and bag-set semantics, taking only chain views may miss optimal solutions, whereas, if we further limit the queries to be path-queries (i.e., chain queries over a single binary relation), then, under bag semantics, path-views suffice. Concentrating to bag-set semantics, we show that the path-viewsets do not suffice for every path-query workload. 相似文献

4.

Semantics preserving SPARQL-to-SQL translation 总被引：2，自引：0，他引：2

Artem Shiyong Farshad 《Data & Knowledge Engineering》2009,68(10):973-1000

Most existing RDF stores, which serve as metadata repositories on the Semantic Web, use an RDBMS as a backend to manage RDF data. This motivates us to study the problem of translating SPARQL queries into equivalent SQL queries, which further can be optimized and evaluated by the relational query engine and their results can be returned as SPARQL query solutions. The main contributions of our research are: (i) We formalize a relational algebra based semantics of SPARQL, which bridges the gap between SPARQL and SQL query languages, and prove that our semantics is equivalent to the mapping-based semantics of SPARQL; (ii) Based on this semantics, we propose the first provably semantics preserving SPARQL-to-SQL translation for SPARQL triple patterns, basic graph patterns, optional graph patterns, alternative graph patterns, and value constraints; (iii) Our translation algorithm is generic and can be directly applied to existing RDBMS-based RDF stores; and (iv) We outline a number of simplifications for the SPARQL-to-SQL translation to generate simpler and more efficient SQL queries and extend our defined semantics and translation to support the bag semantics of a SPARQL query solution. The experimental study showed that our proposed generic translation can serve as a good alternative to existing schema dependent translations in terms of efficient query evaluation and/or ensured query result correctness. 相似文献

5.

Containment of Conjunctive Queries on Annotated Relations

Todd J. Green 《Theory of Computing Systems》2011,49(2):429-459

We study containment and equivalence of (unions of) conjunctive queries on relations annotated with elements of a commutative semiring. Such relations and the semantics of positive relational queries on them were introduced in a recent paper as a generalization of set semantics, bag semantics, incomplete databases, and databases annotated with various kinds of provenance information. We obtain positive decidability results and complexity characterizations for databases with lineage, why-provenance, and provenance polynomial annotations, for both conjunctive queries and unions of conjunctive queries. At least one of these results is surprising given that provenance polynomial annotations seem “more expressive” than bag semantics and under the latter, containment of unions of conjunctive queries is known to be undecidable. The decision procedures rely on interesting variations on the notion of containment mappings. We also show that for any positive semiring (a very large class) and conjunctive queries without self-joins, equivalence is the same as isomorphism. 相似文献

6.

Testing bag-containment of conjunctive queries

Nieves R. Brisaboa Héctor J. Hernández 《Acta Informatica》1997,34(7):557-578

Under the bag-theoretic semantics relations are bags of tuples, that is, a tuple may have any number of duplicates. Under this semantics, a conjunctive query is bag-contained in a conjunctive query , denoted , if for all databases , , the result of applying to , is a subbag of . It is not known whether testing is decidable. In this paper we prove that can be tested on a finite set of canonical databases built from the body of . Using that result we give a procedure that decides the bag-containment problem of conjunctive queries in a large number of cases. Received: 27 September 1995 / 19 June 1996 相似文献

7.

Enriching the conceptual basis for query formulation through relationship semantics in databases

《Information Systems》2001,26(6):445-475

The rapid increase in end-user computing calls into question the suitability of existing database query languages (DBQLs). Because the typical DB end-user is not a DB specialist, it is essential that DBQLs use concepts that are as close as possible to those in the end-users’ cognitive mental model and adopt interface techniques that are suited to end-users’ abilities. Concept-based query languages are well suited for this. This realization has motivated further research in conceptual, or semantic, query approaches. However, the primary focus in this field has been on semantic query optimization, not on query formulation. In this study, we address ourselves to the problem of formulation of queries using concepts. We propose a concept-based query language, called the conceptual query language (CQL), which allows for the conceptual abstraction of database queries and exploits the rich semantics of data models to ease and facilitate query formulation.The CQL approach uses the relationship semantics of semantic data models to render transparent the technical complexities of existing DB query languages. Association semantics are also used to automatically construct query graphs and pseudo-natural language explanations of queries, and to generate SQL codes. A set theoretic formalism for conceptual queries is developed and used. This paper discusses the design of CQL, its expressive power, its implementation, and the strategies for CQL query processing. The implementation of a CQL prototype is briefly discussed in this paper. User experiments were carried out extensively and showed the advantage of CQL over alternative languages such as SQL. 相似文献

8.

SQL语言的形式语义 总被引：2，自引：0，他引：2

李旭帅毛宇光《微机发展》2005,15(3):14-16

对SQL查询的形式语义的研究有助于形式地证明两条SQL语句是否等价,从而消除了自然语言的二义性。SQL标准对SQL的语义规则进行了定义,但是并没有很好地处理不完全信息问题。文中以中介逻辑谓词演算系统MFM为基础,构造一个形式的三值谓词演算模型EPMC,然后通过语法转化规则把SQL查询转化为EPMC,从而完整地定义了SQL查询的形式语义。相似文献

9.

数据库语义合取查询研究

下载免费PDF全文

屈振新唐胜群《计算机工程》2010,36(11):52-54

针对异质、异构数据库的语义集成中,对海量元组进行语义查询时因效率问题而无法使用丰富的语义表达能力的问题,提出一种兼顾速度和语义表达能力的算法,将语义查询和本体都进行图形化表示,并实现子图的语义匹配,将匹配的结果转化成数据库查询语句。与将语义查询直接重写为SQL的算法相比,该算法能支持更丰富的语义。相似文献

10.

The CQL continuous query language: semantic foundations and query execution 总被引：2，自引：0，他引：2

Arvind Arasu Shivnath Babu Jennifer Widom 《The VLDB Journal The International Journal on Very Large Data Bases》2006,15(2):121-142

CQL, a continuous query language, is supported by the STREAM prototype data stream management system (DSMS) at Stanford. CQL is an expressive SQL-based declarative language for registering continuous queries against streams and stored relations. We begin by presenting an abstract semantics that relies only on “black-box” mappings among streams and relations. From these mappings we define a precise and general interpretation for continuous queries. CQL is an instantiation of our abstract semantics using SQL to map from relations to relations, window specifications derived from SQL-99 to map from streams to relations, and three new operators to map from relations to streams. Most of the CQL language is operational in the STREAM system. We present the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries. Examples throughout the paper are drawn from the Linear Road benchmark recently proposed for DSMSs. We also curate a public repository of data stream applications that includes a wide variety of queries expressed in CQL. The relative ease of capturing these applications in CQL is one indicator that the language contains an appropriate set of constructs for data stream processing. Edited by M. Franklin 相似文献

11.

A foundation for conventional and temporal query optimizationaddressing duplicates and ordering

Slivinskas G. Jensen C.S. Snodgrass R.T. 《Knowledge and Data Engineering, IEEE Transactions on》2001,13(1):21-49

Most real-world databases contain substantial amounts of time-referenced, or temporal, data. Recent advances in temporal query languages show that such database applications may benefit substantially from built-in temporal support in the DBMS. To achieve this, temporal query representation, optimization, and processing mechanisms must be provided. This paper presents a foundation for query optimization that integrates conventional and temporal query optimization and is suitable for both conventional DBMS architectures and ones where the temporal support is obtained via a layer on top of a conventional DBMS. This foundation captures duplicates and ordering for all queries, as well as coalescing for temporal queries, thus generalizing all existing approaches known to the authors. It includes a temporally extended relational algebra to which SQL and temporal SQL queries may be mapped, six types of algebraic equivalences, concrete query transformation rules that obey different equivalences, a procedure for determining which types of transformation rules are applicable for optimizing a query, and a query plan enumeration algorithm 相似文献

12.

A general procedure to check conjunctive query containment

Miguel R. Penabad Nieves R. Brisaboa Héctor J. Hernández José R. Paramá 《Acta Informatica》2002,38(7):489-529

In this paper, we present a general procedure to test conjunctive query containment. We divide the containment problem into four categories, taking into account the underlying semantics (set or bag theoretic) and the presence or absence of built-in predicates in the queries. After a brief review of previous work on conjunctive query containment, we present a new procedure, called QCC (Query Containment Checker), which we show to be a general and uniform procedure to check the containment among conjunctive queries under the four categories mentioned above. We briefly describe the use of QCC to check bag containment of conjunctive queries, and explain in detail how to use QCC to check set containment of conjunctive queries with built-in predicates. In our conclusions, we point out some uses of QCC for other types of containment. Received: 21 January 2000 / 19 November 2001 相似文献

13.

Full predicate coverage for testing SQL database queries

Javier Tuya María Jos Surez‐Cabal Claudio de la Riva 《Software Testing, Verification and Reliability》2010,20(3):237-288

In the field of database applications a considerable part of the business logic is implemented using a semi‐declarative language: the Structured Query Language (SQL). Because of the different semantics of SQL compared with other procedural languages, the conventional coverage criteria for testing are not directly applicable. This paper presents a criterion specifically tailored for SQL queries (SQLFpc). It is based on Masking Modified Condition Decision Coverage (MCDC) or Full Predicate Coverage and takes into account a wide range of the syntax and semantics of SQL, including selection, joining, grouping, aggregations, subqueries, case expressions and null values. The criterion assesses the coverage of the test data in relation to the query that is executed and it is expressed as a set of rules that are automatically generated and efficiently evaluated against a test database. The use of the criterion is illustrated in a case study, which includes complex queries. Copyright © 2010 John Wiley & Sons, Ltd. 相似文献

14.

Answering Queries Using Limited External Query Processors

《Journal of Computer and System Sciences》1999,58(1):69-82

When answering queries using external information sources, the contents of the queries can be described by views. To answer a query, we must rewrite it using the set of views presented by the sources. When the external information sources also have the ability to answer some (perhaps limited) sets of queries that require performing operations on their data, the set of views presented by the source may be infinite (albeit encoded in some finite fashion). Previous work on answering queries using views has only considered the case where the set of views is finite. In order to exploit the ability of information sources to answer more complex queries, we consider the problem of answering conjunctive queries using infinite sets of conjunctive views. Our first result is that an infinite set of conjunctive views can be partitioned into a finite number of equivalence classes, such that picking one view from every nonempty class is sufficient to determine whether the query can be answered using the views. Second, we show how to compute the set of equivalence classes for sets of conjunctive views encoded by a datalog program. Furthermore, we extend our results to the case when the query and the views use the built-in predicates <, ⩽, =, and ≠, and they are interpreted over a dense domain. Finally, we extend our results to conjunctive queries and views with the built-in predicates <, ⩽, and = interpreted over the integers. In doing so we present a result of independent interest, namely, an algorithm to minimize such queries. 相似文献

15.

Mining for Attribute Definitions in a Distributed Two-Layered DB System

Zbigniew W. Ras Jan M. Żytkow 《Journal of Intelligent Information Systems》2000,14(2-3):115-130

Empirical equations are an important class of regularities that can be discovered in databases. We concentrate on the role of equations as definitions of attribute values. Such definitions can be used in many ways in a single database and for transfer of knowledge between databases. We present a quest for equations that can be used as definitions of an attribute in a given database. That quest triggers a discovery mechanism that specializes in searching recursively a system of databases and returns a set of partial definitions. We introduce the notion of shared operational semantics. It is founded on an equation-based system of partial definitions and it gives necessary foundations for designing local query answering systems in a distributed two-layered information system (D²LIS). The knowledge exchange between two sites of D²LIS may only improve an equation-based system of partial definitions at each of these sites. At the same time the shared operational semantics will better interpret user queries. Operational semantics augments the earlier developed semantics for rules used as attribute definitions. To put the shared operational semantics on a firm theoretical foundation we give a formal interpretation of queries which justifies empirical equations in their definitional role. 相似文献

16.

Efficient query evaluation on probabilistic databases

Nilesh Dalvi Dan Suciu 《The VLDB Journal The International Journal on Very Large Data Bases》2007,16(4):523-544

We describe a framework for supporting arbitrarily complex SQL queries with “uncertain” predicates. The query semantics is based on a probabilistic model and the results are ranked, much like in Information Retrieval. Our main focus is query evaluation. We describe an optimization algorithm that can compute efficiently most queries. We show, however, that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods. For these queries we describe both an approximation algorithm and a Monte-Carlo simulation algorithm. 相似文献

17.

Equivalence of formal semantics definition methods

M. J. A. Caswell 《Formal Aspects of Computing》1997,9(1):68-77

There are numerous methods of formally defining the semantics of computer languages. Each method has been designed to fulfil a different purpose. For example, some have been designed to make reasoning about languages as easy as possible; others have been designed to be accessible to a large audience and some have been designed to ease implementation of languages. Given two semantics definitions of a language written using two separate semantics definition methods, we must be able to show that the two are in fact equivalent. If we cannot do this then we either have an error in one of the semantics definitions, or more seriously we have a problem with the semantics definition methods themselves.Three methods of defining the semantics of computer languages have been considered, i.e. Denotational Semantics, Structural Operational Semantics and Action Semantics. An equivalence between these three is shown for a specific example language by first defining its semantics using each of the three definition methods. The proof of the equivalence is then constructed by selecting pairs of the semantics definitions and showing that they define the same language.A full version of this paper can be accessed via our web page http://www.cs.man.ac.uk/fmethods/ facj.html 相似文献

18.

A parametric approach to deductive databases with uncertainty

Lakshmanan L.V.S. Shiri N. 《Knowledge and Data Engineering, IEEE Transactions on》2001,13(4):554-570

Numerous frameworks have been proposed in recent years for deductive databases with uncertainty. On the basis of how uncertainty is associated with the facts and rules in a program, we classify these frameworks into implication-based (IB) and annotation-based (AB) frameworks. We take the IB approach and propose a generic framework, called the parametric framework, as a unifying umbrella for IB frameworks. We develop the declarative, fixpoint, and proof-theoretic semantics of programs in our framework and show their equivalence. Using the framework as a basis, we then study the query optimization problem of containment of conjunctive queries in this framework and establish necessary and sufficient conditions for containment for several classes of parametric conjunctive queries. Our results yield tools for use in the query optimization for large classes of query programs in IB deductive databases with uncertainty 相似文献

19.

Answering queries using materialized views with minimum size

Rada Chirkova Chen Li Jia Li 《The VLDB Journal The International Journal on Very Large Data Bases》2006,15(3):191-210

In this paper, we study the following problem. Given a database and a set of queries, we want to find a set of views that can compute the answers to the queries, such that the amount of space, in bytes, required to store the viewset is minimum on the given database. (We also handle problem instances where the input has a set of database instances, as described by an oracle that returns the sizes of view relations for given view definitions.) This problem is important for applications such as distributed databases, data warehousing, and data integration. We explore the decidability and complexity of the problem for workloads of conjunctive queries. We show that results differ significantly depending on whether the workload queries have self-joins. Further, for queries without self-joins we describe a very compact search space of views, which contains all views in at least one optimal viewset. We present techniques for finding a minimum-size viewset for a single query without self-joins by using the shape of the query and its constraints, and validate the approach by extensive experiments. Part of this article was published elsewhere [Chirkova, R., Li, C.: Materializing views with minimal size to answer queries. PODS (2003)]. In addition to the prior materials, this article contains new theoretical results, as well as new results on how to efficiently implement the proposed techniques (Sects. 5 and 5.4) 相似文献

20.

Generalization of strategies for fuzzy query translation in classical relational databases

《Information and Software Technology》2007,49(2):172-180

Users of information systems would like to express flexible queries over the data possibly retrieving imperfect items when the perfect ones, which exactly match the selection conditions, are not available. Most commercial DBMSs are still based on the SQL for querying. Therefore, providing some flexibility to SQL can help users to improve their interaction with the systems without requiring them to learn a completely novel language. Based on the fuzzy set theory and the α-cut operation of fuzzy number, this paper presents the generic fuzzy queries against classical relational databases and develops the translation of the fuzzy queries. The generic fuzzy queries mean that the query condition consists of complex fuzzy terms as the operands and complex fuzzy relations as the operators in a fuzzy query. With different thresholds that the user chooses for the fuzzy query, the user’s fuzzy queries can be translated into precise queries for classical relational databases. 相似文献