Similar literature
 Found 20 similar documents; search took 15 ms
1.
Theoretical models used in database research often have subtle differences with those occurring in practice. One particular mismatch that is usually neglected concerns the use of marked nulls to represent missing values in theoretical models of incompleteness, while in an SQL database these are all denoted by the same syntactic NULL object. It is commonly argued that results obtained in the model with marked nulls carry over to SQL, because SQL nulls can be interpreted as Codd nulls, which are simply marked nulls that do not repeat. This argument, however, does not take into account that even simple queries may produce answers where distinct occurrences of NULL do in fact denote the same unknown value. For such queries, interpreting SQL nulls as Codd nulls would incorrectly change the semantics of query answers. To use results about Codd nulls for real-life SQL queries, we need to understand which queries preserve the Codd interpretation of SQL nulls. We show, however, that the class of relational algebra queries preserving Codd interpretation is not recursively enumerable, which necessitates looking for sufficient conditions for such preservation. Those can be obtained by exploiting the information provided by NOT NULL constraints on the database schema. We devise mild syntactic restrictions on queries that guarantee preservation, do not limit the full expressiveness of queries on databases without nulls, and can be checked efficiently.
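The mismatch described in this abstract can be illustrated with a small sketch. It is not taken from the paper: the `Null` class, the helper names, and the choice of a self Cartesian product as the "simple query" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import count, product


@dataclass(frozen=True)
class Null:
    """A marked null: two Null objects denote the same unknown value iff their labels match."""
    label: int


def to_sql(rows):
    """Erase null labels, as SQL does: every null becomes the same NULL token."""
    return [tuple("NULL" if isinstance(v, Null) else v for v in row) for row in rows]


def codd_reading(sql_rows):
    """Read SQL output as Codd nulls: each NULL occurrence becomes a fresh, non-repeating null."""
    fresh = count(1)
    return [tuple(Null(next(fresh)) if v == "NULL" else v for v in row) for row in sql_rows]


def is_codd(rows):
    """True if no marked null occurs more than once in the table."""
    labels = [v.label for row in rows for v in row if isinstance(v, Null)]
    return len(labels) == len(set(labels))


# A one-column relation R holding a single unknown value (hypothetical data).
r = [(Null(1),)]

# The query R x R repeats that unknown value in both columns of its answer.
answer = [a + b for a, b in product(r, r)]

print(is_codd(r))                               # the input is a Codd table
print(is_codd(answer))                          # the answer repeats a null
print(to_sql(answer))                           # SQL shows only NULL tokens
print(codd_reading(to_sql(answer)) == answer)   # the Codd reading differs from the true answer
```

In this toy case the input contains no repeated nulls, yet the answer does, so reading the SQL output as Codd nulls forgets that both columns hold the same unknown value.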

2.
We present a high-level query language, called HIFUN, for defining analytic queries over big datasets, independently of how these queries are evaluated. An analytic query in HIFUN is defined to be a well-formed expression of a functional algebra that we define in the paper. The operations of this algebra combine functions to create HIFUN queries in much the same way as the operations of the relational algebra combine relations to create algebraic queries. The contributions of this paper are: (a) the definition of a formal framework in which to study analytic queries in the abstract; (b) the encoding of a HIFUN query either as a MapReduce job or as an SQL group-by query; and (c) the definition of a formal method for rewriting HIFUN queries and, as a case study, its application to the rewriting of MapReduce jobs and of SQL group-by queries. We emphasize that, although theoretical in nature, our work uses only basic and well-known mathematical concepts, namely functions and their basic operations.
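The abstract does not spell out the algebra, so the following is only a sketch of the general idea of building a group-by query out of functions. The assumption that a query is a triple (grouping function, measuring function, reduction operation), the `evaluate` and `compose` helpers, and the invoice data are all hypothetical.

```python
from collections import defaultdict
from functools import reduce
import operator


def evaluate(dataset, group, measure, op):
    """Evaluate a hypothetical (group, measure, op) query: group items, measure them, reduce."""
    buckets = defaultdict(list)
    for item in dataset:
        buckets[group(item)].append(measure(item))
    return {key: reduce(op, values) for key, values in buckets.items()}


def compose(f, g):
    """Function composition: apply g first, then f."""
    return lambda x: f(g(x))


# Hypothetical dataset of delivery invoices.
invoices = [
    {"branch": "b1", "product": "p1", "qty": 200},
    {"branch": "b1", "product": "p2", "qty": 100},
    {"branch": "b2", "product": "p1", "qty": 300},
]
branch = operator.itemgetter("branch")
qty = operator.itemgetter("qty")
region = {"b1": "north", "b2": "north"}.get  # hypothetical branch -> region function

# Roughly: SELECT branch, SUM(qty) FROM invoices GROUP BY branch
totals_by_branch = evaluate(invoices, branch, qty, operator.add)

# Composing functions changes the grouping level without touching the evaluator.
totals_by_region = evaluate(invoices, compose(region, branch), qty, operator.add)

print(totals_by_branch)
print(totals_by_region)
```

The same triple could equally be handed to a MapReduce-style evaluator (map emits `(group(x), measure(x))`, reduce applies `op`), which is the kind of evaluation independence the abstract refers to.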

3.
4.
Coupled transformation occurs when multiple software artifacts must be transformed in such a way that they remain consistent with each other. For instance, when a database schema is adapted in the context of system maintenance, the persistent data residing in the system's database needs to be migrated to conform to the adapted schema. Also, queries embedded in the application code and any declared referential constraints must be adapted to take the schema changes into account. As another example, in XML-to-relational data mapping, a hierarchical XML Schema is mapped to a relational SQL schema with appropriate referential constraints, and the XML documents and queries are converted into relational data and relational queries. The 2LT project is aimed at providing a formal basis for coupled transformation. This formal basis is found in data refinement theory, point-free program calculation, and strategic term rewriting. We formalize the coupled transformation of a data type by an algebra of information-preserving data refinement steps, each witnessed by appropriate data conversion functions. Refinement steps are modeled by so-called two-level rewrite rules on type expressions that synthesize conversion functions between redex and reduct while rewriting. Strategy combinators are used to compose two-level rewrite rules into complete rewrite systems. Point-free program calculation is applied to optimize synthesized conversion functions, to migrate queries, and to normalize data type constraints. In this paper, we provide an overview of the challenges met by the 2LT project and we give a sketch of the solutions offered.

5.
The Semantic Web’s promise of web-wide data integration requires the inclusion of legacy relational databases, i.e. the execution of SPARQL queries on RDF representation of the legacy relational data. We explore a hypothesis: existing commercial relational databases already subsume the algorithms and optimizations needed to support effective SPARQL execution on existing relationally stored data. The experiment is embodied in a system, Ultrawrap, that encodes a logical representation of the database as an RDF graph using SQL views and a simple syntactic translation of SPARQL queries to SQL queries on those views. Thus, in the course of executing a SPARQL query, the SQL optimizer uses the SQL views that represent a mapping of relational data to RDF, and optimizes its execution. In contrast, related research is predicated on incorporating optimizing transforms as part of the SPARQL to SQL translation, and/or executing some of the queries outside the underlying SQL environment. Ultrawrap is evaluated using two existing benchmark suites that derive their RDF data from relational data through a Relational Database to RDF (RDB2RDF) Direct Mapping, and the evaluation is repeated for each of the three major relational database management systems. Empirical analysis reveals two existing relational query optimizations that, if applied to the SQL produced from a simple syntactic translation of SPARQL queries (with bound predicate arguments) to SQL, consistently yield query execution time that is comparable to that of SQL queries written directly for the relational representation of the data. The analysis further reveals the two optimizations are not uniquely required to achieve a successful wrapper system. The evidence suggests effective wrappers will be those that are designed to complement the optimizer of the target database.

6.
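Using worked examples, this paper describes optimization techniques for implementing nested SQL queries in KD-Base, a multi-user relational database management system, and gives an optimized method for converting basic KD-SQL queries into relational algebra queries.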

7.
Semantics preserving SPARQL-to-SQL translation   Total citations: 2 (self-citations: 0, citations by others: 2)
Most existing RDF stores, which serve as metadata repositories on the Semantic Web, use an RDBMS as a backend to manage RDF data. This motivates us to study the problem of translating SPARQL queries into equivalent SQL queries, which can then be optimized and evaluated by the relational query engine, with their results returned as SPARQL query solutions. The main contributions of our research are: (i) We formalize a relational algebra based semantics of SPARQL, which bridges the gap between SPARQL and SQL query languages, and prove that our semantics is equivalent to the mapping-based semantics of SPARQL; (ii) Based on this semantics, we propose the first provably semantics preserving SPARQL-to-SQL translation for SPARQL triple patterns, basic graph patterns, optional graph patterns, alternative graph patterns, and value constraints; (iii) Our translation algorithm is generic and can be directly applied to existing RDBMS-based RDF stores; and (iv) We outline a number of simplifications for the SPARQL-to-SQL translation to generate simpler and more efficient SQL queries and extend our defined semantics and translation to support the bag semantics of a SPARQL query solution. The experimental study showed that our proposed generic translation can serve as a good alternative to existing schema-dependent translations in terms of efficient query evaluation and/or ensured query result correctness.
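To give a feel for what translating a basic graph pattern into SQL involves, here is a naive sketch. It is not the authors' algorithm and makes no claim to preserve semantics in general: it assumes a single hypothetical table `triples(s, p, o)`, handles only conjunctions of triple patterns, and ignores optional patterns, alternatives, value constraints, nulls, and proper SQL escaping.

```python
def bgp_to_sql(patterns, table="triples"):
    """Translate a list of (s, p, o) triple patterns into one SQL query over a triple table.

    Terms starting with '?' are variables; everything else is treated as a constant.
    Assumes at least one variable occurs, otherwise the SELECT list would be empty.
    """
    select, where, first_seen = [], [], {}
    for i, pattern in enumerate(patterns):
        for column, term in zip(("s", "p", "o"), pattern):
            ref = f"t{i}.{column}"
            if not term.startswith("?"):
                where.append(f"{ref} = '{term}'")             # constant: selection
            elif term in first_seen:
                where.append(f"{ref} = {first_seen[term]}")   # repeated variable: join condition
            else:
                first_seen[term] = ref                        # first occurrence: output column
                select.append(f"{ref} AS {term[1:]}")
    sql = "SELECT " + ", ".join(select)
    sql += " FROM " + ", ".join(f"{table} t{i}" for i in range(len(patterns)))
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql


# Hypothetical pattern: { ?x knows ?y . ?y name ?n }
query = bgp_to_sql([("?x", "knows", "?y"), ("?y", "name", "?n")])
print(query)
```

Each triple pattern becomes one alias of the triple table, constants become selections, and shared variables become join conditions; the harder cases listed in the abstract (optional and alternative patterns, bag semantics) are where a careful formal semantics is needed.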

8.
Performing complex analysis on top of massive data stores is essential to most modern enterprises and organizations and requires simple, flexible and powerful syntactic constructs to express naturally and succinctly complex decision support queries. In addition, these linguistic features have to be coupled with appropriate evaluation and optimization techniques in order to efficiently compute these queries. In this article we review the concept of grouping variable and describe a simple SQL extension to match it. We show that this extension enables the facile expression of a large class of practical data analysis queries. Besides syntactic simplicity, grouping variables can be neatly modeled in relational algebra via a relational operator, called MD-join. MD-join combines joins and group-bys (a frequent case in decision support queries) into one operator, allowing novel evaluation and optimization techniques. By making explicit how joins interact with group-bys, we provide the optimizer with enough information to use specific algorithms and employ appropriate optimization plans, not easily detectable previously. Several experiments demonstrate substantial performance improvements, in some cases of one or two orders of magnitude. The work on grouping variables has influenced at least one commercial system and the standardization of ANSI SQL, and implementations of it have been studied in the context of telecom applications, medical and bio-informatics, finance and others. Finally, current work studies the potential of grouping variables in formulating decision support queries over streams of data, one of the latest research trends in the database community.

9.
This research investigates an approach to query processing in a multidatabase system that uses an object-oriented model to capture the semantics of other data models. The object-oriented model is used to construct a global schema, defining an integrated view of the different schemas in the environment. The model is also used as a self-describing model to build a meta-database for storing information about the global schema. A unique aspect of this work is that the object-oriented model is used to describe the different data models of the multidatabase environment, thereby extending the meta-database with semantic information about the local schemas. With the global and local schemas all represented in an object-oriented form, structural mappings between the global schema and each local schema are then easily supported. An object algebra then provides a query language for expressing global queries, using the structural mappings to translate object algebra queries into SQL queries over local relational schemas. The advantage of using an object algebra is that the object-oriented database can be viewed as a blackboard for temporary storage of local data and for establishing relationships between different databases. The object algebra can be used to directly retrieve temporarily-stored data from the object-oriented database or to transparently retrieve data from local sources using the translation process described in this paper.

10.
Identifying similarities in large datasets is an essential operation in several applications such as bioinformatics, pattern recognition, and data integration. To make a relational database management system similarity-aware, the core relational operators have to be extended. While similarity-awareness has been introduced in database engines for relational operators such as joins and group-by, little has been achieved for relational set operators, namely Intersection, Difference, and Union. In this paper, we propose to extend the semantics of relational set operators to take into account the similarity of values. We develop efficient query processing algorithms for evaluating them, and implement these operators inside an open-source database system, namely PostgreSQL. By extending several queries from the TPC-H benchmark to include predicates that involve similarity-based set operators, we perform extensive experiments that demonstrate up to three orders of magnitude speedup in performance over equivalent queries that only employ regular operators.
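The abstract does not state the exact semantics of the similarity-based operators, so the sketch below assumes one simple reading: a value from the left input matches if some value in the right input lies within a distance threshold `eps`. The function names, the one-dimensional numeric values, and the quadratic nested-loop evaluation are illustrative assumptions, not the paper's algorithms.

```python
def similarity_intersect(left, right, eps):
    """Keep each left value that has at least one right value within distance eps."""
    return [a for a in left if any(abs(a - b) <= eps for b in right)]


def similarity_difference(left, right, eps):
    """Keep each left value that has no right value within distance eps."""
    return [a for a in left if all(abs(a - b) > eps for b in right)]


# Hypothetical single-column relations.
left = [10, 50, 90]
right = [12, 85, 200]

print(similarity_intersect(left, right, eps=5))    # values of left with a close match in right
print(similarity_difference(left, right, eps=5))   # values of left with no close match
print(similarity_intersect(left, right, eps=0))    # eps = 0 falls back to exact intersection
```

With `eps = 0` the operators coincide with the regular set operators, which is one way to see them as a generalization rather than a separate mechanism.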

11.
We introduce a new abstract model of database query processing, finite cursor machines, that incorporates certain data streaming aspects. The model describes quite faithfully what happens in so-called “one-pass” and “two-pass query processing”. Technically, the model is described in the framework of abstract state machines. Our main results are upper and lower bounds for processing relational algebra queries in this model, specifically, queries of the semijoin fragment of the relational algebra.
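As an informal illustration of one-pass processing with forward-only cursors (not the formal finite cursor machine model of the paper), the sketch below computes a semijoin in a single pass. It assumes both inputs are already sorted on the join key; the function name and the sample relations are hypothetical.

```python
def semijoin_sorted(r, s, key_r, key_s):
    """Return the rows of r that have a matching key in s, in one forward pass.

    Assumes r and s are both sorted in ascending order of their join key.
    """
    out = []
    j = 0                                    # cursor on s; it never moves backwards
    for row in r:                            # cursor on r; it never moves backwards
        k = key_r(row)
        while j < len(s) and key_s(s[j]) < k:
            j += 1                           # skip s-rows that can no longer match
        if j < len(s) and key_s(s[j]) == k:
            out.append(row)                  # keep the r-row; s contributes no columns
    return out


# Hypothetical relations, sorted on their first column.
r = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
s = [(2,), (3,), (4,), (4,)]

first = lambda row: row[0]
result = semijoin_sorted(r, s, first, first)
print(result)
```

Each cursor only advances, and the working memory is a constant number of positions, which is the flavor of restriction the abstract's "one-pass" terminology refers to.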

12.
Two important issues in the design of relational model banks are the degree to which they should be aggregated or disaggregated and the methods by which disaggregated model banks might be integrated in response to user queries. Three topics relevant to these issues are addressed in this paper. The first is whether a universal model and its projections may possess the lossy join property. We will show that they do not. The second is the development of a relational algebra for the specification of join implementation in model banks, and the third is the realization of such an algebra in a language similar to Query-by-Example.

13.
Different classes of recursive queries in relational databases are identified. It is shown that existing proposals to extend the relational query languages are either not powerful enough to express queries in many of these classes or use non-first-normal-form constructs. RQL, a recursive database query language that can be used to express recursive queries on all the classes identified, is presented. RQL is based on the relational algebra. In addition to functions that correspond to the standard and extended relational algebra operators, RQL supports functions required to express general recursive queries. The elements of RQL and the ways in which they are used to formulate complicated, but useful, recursive queries are described. The effects of the extensions embodied in RQL on the termination of recursive query evaluation are discussed.

14.
FP is the programming language defined by J. Backus to demonstrate the virtues of functional programming as opposed to conventional programming in Von Neumann-like languages. In this paper we investigate the use of FP in the framework of relational databases. In particular, we show how the language can be used to define base relations, to derive views from a collection of relations, and to express complex database queries. The language provides all capabilities of pure algebraic relational languages, but is considerably more powerful. As such, it can be used as a formal specification language to describe the semantics of queries expressed in relational languages, such as Query-By-Example. In addition, the algebra of FP programs allows one to formally prove properties of such queries.

15.
We present a general rank-aware model of data which supports handling of similarity in relational databases. The model is based on the assumption that in many cases it is desirable to replace equalities on values in data tables by similarity relations expressing degrees to which the values are similar. In this context, we study various phenomena which emerge in the model, including similarity-based queries and similarity-based data dependencies. A central notion in our model is that of a ranked data table over domains with similarities, which is our counterpart to the notion of a relation on a relation scheme from the classical relational model. Compared to other approaches which cover related problems, we do not propose a similarity-based or ranking module on top of the classical relational model. Instead, we generalize the very core of the model by replacing the classical, two-valued logic upon which the classical model is built by a more general logic involving a scale of truth degrees that, in addition to the classical truth degrees 0 and 1, contains intermediate truth degrees. While the classical truth degrees 0 and 1 represent nonequality and equality of values, and subsequently mismatch and match of queries, the intermediate truth degrees in the new model represent similarity of values and partial match of queries. Moreover, the truth functions of many-valued logical connectives in the new model serve to aggregate degrees of similarity. The presented approach is conceptually clean, logically sound, and retains most properties of the classical model while enabling us to employ new types of queries and data dependencies. Most importantly, similarity is not handled in an ad hoc way or by putting a “similarity module” atop the classical model in our approach. Rather, it is consistently viewed as a notion that generalizes and replaces equality in the very core of the relational model.
We present fundamentals of the formal model and two equivalent query systems which are analogues of the classical relational algebra and domain relational calculus with range declarations. In the sequel to this paper, we deal with similarity-based dependencies.

16.
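Xiao Jie (肖杰), Xie Dong (谢东), Zeng Binshi (曾玢石). Computer Engineering (《计算机工程》), 2009, 35(16): 73-75
Existing SQL aggregate functions have clear limitations when it comes to computing percentages. This paper proposes two kinds of percentage aggregate functions, horizontal and vertical, for computing percentages. The new aggregate functions are easy to use and widely applicable, can serve as a framework for studying percentage queries, and can generate SQL code efficiently. An experimental study reports the execution performance of the percentage aggregation methods and of the SQL/OLAP aggregation method; the results show that the two proposed methods offer some performance improvement over SQL/OLAP aggregation.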
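The abstract does not define the two aggregate functions. The sketch below assumes one common reading: a "vertical" percentage returns one row per (group, category) pair holding that category's share of the group total, and a "horizontal" percentage returns one row per group with one percentage per category. The function names and the sales data are hypothetical, and group totals are assumed to be nonzero.

```python
from collections import defaultdict


def vertical_pct(rows):
    """rows: (group, category, amount). Returns sorted (group, category, pct) rows."""
    totals = defaultdict(float)   # total amount per group
    cells = defaultdict(float)    # total amount per (group, category)
    for group, category, amount in rows:
        totals[group] += amount
        cells[(group, category)] += amount
    # Assumes every group total is nonzero; otherwise this division fails.
    return [(g, c, 100.0 * v / totals[g]) for (g, c), v in sorted(cells.items())]


def horizontal_pct(rows):
    """Returns one entry per group, with the percentages of all its categories side by side."""
    out = defaultdict(dict)
    for group, category, pct in vertical_pct(rows):
        out[group][category] = pct
    return dict(out)


# Hypothetical sales facts: (region, product, amount).
sales = [
    ("east", "tv", 30), ("east", "radio", 10), ("east", "tv", 10),
    ("west", "tv", 50), ("west", "radio", 50),
]

print(vertical_pct(sales))
print(horizontal_pct(sales))
```

In plain SQL the vertical form is typically written by joining the fact table with a per-group `SUM` subquery or by using a window function; the sketch only shows the intended result shape of each layout.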

17.
Query languages for relational multidatabases   Total citations: 2 (self-citations: 0, citations by others: 2)
With the existence of many autonomous databases widely accessible through computer networks, users will require the capability to jointly manipulate data in different databases. A multidatabase system provides such a capability through a multidatabase manipulation language, such as MSQL. We propose a theoretical foundation for such languages by presenting a multirelational algebra and calculus based on the relational algebra and calculus. The proposal is illustrated by various queries on an example multidatabase. It is shown that properties of the multirelational algebra may be used for optimization and that every multirelational algebra query can be expressed as a multirelational calculus query. The connection between the multirelational languages and MSQL, the multidatabase version of SQL, is also investigated.

18.
XML is rapidly emerging as a standard for exchanging business data on the World Wide Web. For the foreseeable future, however, most business data will continue to be stored in relational database systems. Consequently, if XML is to fulfill its potential, some mechanism is needed to publish relational data as XML documents. Towards that goal, one of the major challenges is finding a way to efficiently structure and tag data from one or more tables as a hierarchical XML document. Different alternatives are possible depending on when this processing takes place and how much of it is done inside the relational engine. In this paper, we characterize and study the performance of these alternatives. Among other things, we explore the use of new scalar and aggregate functions in SQL for constructing complex XML documents directly in the relational engine. We also explore different execution plans for generating the content of an XML document. The results of an experimental study show that constructing XML documents inside the relational engine can have a significant performance benefit. Our results also show the superiority of having the relational engine use what we call an “outer union plan” to generate the content of an XML document.
Received: 15 October 2000 / Accepted: 15 April 2001 / Published online: 28 June 2001

19.
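When database applications are developed in C++, the SQL statements they generate do not receive the necessary checks at compile time. This paper studies how to use the C++ compiler to check relational algebra operations at compile time and to generate correct SQL queries from relational algebra, so that part of the checking of SQL queries is moved from run time to compile time.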

20.
We study the problem of maintaining recursively defined views, such as the transitive closure of a relation, in traditional relational languages that do not have recursion mechanisms. The main results of this paper are negative ones: we show that a certain property of query languages implies impossibility of such incremental maintenance. The property we use is locality of queries, which is known to hold for relational calculus and various extensions, including those with grouping and aggregate constructs (essentially, plain SQL).
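For readers unfamiliar with the view in question, the sketch below computes the transitive closure of a binary relation by iterating a join-and-project step until nothing new appears. It only illustrates what a recursively defined view is; it says nothing about the paper's maintenance results. The function name and sample edges are hypothetical.

```python
def transitive_closure(edges):
    """Return all pairs (a, b) such that b is reachable from a via one or more edges."""
    closure = set(edges)
    while True:
        # One non-recursive relational step: join closure with edges, then project.
        step = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if step <= closure:        # fixpoint reached: the step adds nothing new
            return closure
        closure |= step


# Hypothetical edge relation: a path 1 -> 2 -> 3 -> 4.
edges = {(1, 2), (2, 3), (3, 4)}
tc = transitive_closure(edges)
print(sorted(tc))
```

Each loop iteration is an ordinary join and projection, but the number of iterations depends on the data, and that unbounded loop is the recursion mechanism that the traditional relational languages discussed in the abstract do not provide.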
