Similar literature
Found 20 similar articles (search time: 421 ms)
1.
The Semantic Web’s promise of web-wide data integration requires the inclusion of legacy relational databases, i.e., the execution of SPARQL queries on an RDF representation of the legacy relational data. We explore a hypothesis: existing commercial relational databases already subsume the algorithms and optimizations needed to support effective SPARQL execution on existing relationally stored data. The experiment is embodied in a system, Ultrawrap, that encodes a logical representation of the database as an RDF graph using SQL views and a simple syntactic translation of SPARQL queries to SQL queries on those views. Thus, in the course of executing a SPARQL query, the SQL optimizer uses the SQL views that represent a mapping of relational data to RDF, and optimizes its execution. In contrast, related research is predicated on incorporating optimizing transforms as part of the SPARQL to SQL translation, and/or executing some of the queries outside the underlying SQL environment. Ultrawrap is evaluated using two existing benchmark suites that derive their RDF data from relational data through a Relational Database to RDF (RDB2RDF) Direct Mapping; the evaluation is repeated for each of the three major relational database management systems. Empirical analysis reveals two existing relational query optimizations that, if applied to the SQL produced from a simple syntactic translation of SPARQL queries (with bound predicate arguments) to SQL, consistently yield query execution times comparable to those of SQL queries written directly for the relational representation of the data. The analysis further reveals the two optimizations are not uniquely required to achieve a successful wrapper system. The evidence suggests effective wrappers will be those that are designed to complement the optimizer of the target database.
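The idea of exposing relational rows as RDF triples through a SQL view, and answering a SPARQL pattern by a purely syntactic translation onto that view, can be sketched as follows. This is only an illustration under stated assumptions: the `person` table, the `triples` view, and the IRI-like strings are hypothetical, and the view definition is not claimed to match the one Ultrawrap actually generates.

```python
import sqlite3

# Hypothetical relational table; all names here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
INSERT INTO person VALUES (1, 'Ada', 'London'), (2, 'Alan', 'Manchester');

-- One SELECT per column, unioned: each cell becomes (subject, predicate, object).
CREATE VIEW triples AS
    SELECT 'person/' || id AS s, 'person#name' AS p, name AS o FROM person
    UNION ALL
    SELECT 'person/' || id AS s, 'person#city' AS p, city AS o FROM person;
""")

# SPARQL: SELECT ?n WHERE { ?x person#city "London" . ?x person#name ?n }
# Syntactic translation: one reference to the triple view per triple pattern,
# joined on the shared variable ?x.
sql = """
SELECT t2.o
FROM triples AS t1 JOIN triples AS t2 ON t1.s = t2.s
WHERE t1.p = 'person#city' AND t1.o = 'London'
  AND t2.p = 'person#name'
"""
names = [row[0] for row in conn.execute(sql)]
print(names)
```

Because the predicates in the query are bound to constants, an optimizer is in principle free to discard the branches of the union that cannot match, which is the kind of work the abstract argues should be left to the database.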

2.
In a data mining project developed on a relational database, a significant effort is required to build a data set for analysis. The main reason is that, in general, the database has a collection of normalized tables that must be joined, aggregated and transformed in order to build the required data set. Such a scenario results in many complex SQL queries that are written independently from each other, in a disorganized manner. Therefore, the database grows with many tables and views that are not present as entities in the ER model, and similar SQL queries are written multiple times, creating problems in database evolution and software maintenance. In this paper, we classify potential database transformations, we extend an ER diagram with entities capturing database transformations, and we introduce an algorithm which automates the creation of such an extended ER model. We present a case study with a public database illustrating database transformations to build a data set to compute a typical data mining model.

3.
Modern database systems are in desperate need of the ability to support highly scalable transactions and efficient queries simultaneously for real-time applications. One solution is to apply query optimization techniques to on-line transaction processing (OLTP) systems. The materialized view is often considered a panacea for reducing query latency. However, it also incurs a significant maintenance cost, which trades away transaction performance. In this paper, we examine the design space and derive several design features for implementing views on a distributed log-structured merge-tree (LSM-tree), a well-known structure for improving data write performance. As a result, we develop two incremental view maintenance (IVM) approaches on the LSM-tree. One avoids join computation in view maintenance transactions. The other, with two optimizations, decouples view maintenance from transaction processing. Under asynchronous updates, we also support consistent queries on views. Experiments on the TPC-H benchmark show that our methods achieve better performance than straightforward methods on different workloads.

4.
Information Systems, 2001, 26(5): 363-381
A data warehouse (DW) can be abstractly seen as a set of materialized views defined over a set of remote data sources. A DW is intended to satisfy a set of queries. The views materialized in a DW relate to each other in a complex manner, through common subexpressions, in order to guarantee high query performance and low view maintenance cost. DWs are time varying. As time passes new materialized views are added in order to satisfy new queries, or for performance reasons, while old queries are dropped. The evolution of a DW can result in a redundant set of materialized views. In this paper, we address the problem of detecting redundant materialized views in a given DW view selection, that is, materialized views that can be removed from the DW without negatively affecting the query evaluation or the view maintenance process. Using an AND/OR dag representation for multiple queries and views, we first formalize the process of propagating source relation changes to the materialized views by exploiting common subexpressions between views and by using other materialized views that are not affected by these changes. Then, we provide an algorithm for detecting materialized views that are not needed in the process of propagating source relation changes to the DW. We also show how trivially redundant views can be identified in this process. Finally, we use these results to provide a procedure for detecting materialized views that are redundant in a DW. Our approach considers a broad class of views that includes grouping/aggregation views and is not dependent on a specific cost model.

5.
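In Web applications, queries over XML-formatted information are often constrained by factors such as limited network transmission speed. To reduce the network data transfer overhead incurred in keeping XML materialized views consistent with their data sources, an incremental maintenance method and system framework for remote XML materialized views is proposed. Based on multi-user query requests and data source update information, the method generates view maintenance program code and replaces repeated querying of XML views with network migration of that code, effectively reducing the volume of data transferred over the network. The basic principles of incremental materialized view maintenance, the system framework, and the design and implementation approach are described. Finally, performance tests show that this incremental maintenance system effectively reduces transmission overhead.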

6.
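An efficient view maintenance mechanism for multidatabase systems   Total citations: 1, self-citations: 0, citations by others: 1
Outer joins are frequently used in multidatabase systems to generate global views, but they make maintaining those views considerably more difficult. Existing view maintenance algorithms for multidatabase systems consider only join operations (select-project-join, SPJ); if a global view is generated through outer joins, these algorithms cannot effectively maintain the correctness of the view. A new view maintenance algorithm for multidatabase systems is proposed that maintains views efficiently in the presence of outer joins and data inconsistency, and minimizes the number of queries sent to the local databases, making the multidatabase system more efficient.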

7.
Views are understood as a good means to tailor base relations individually to the needs of each user. However, if a user formulates his queries in terms of views, he often has no chance to express these queries without joins. In terms of base relations many of these joins would not be necessary, and therefore the advantages of the view concept are paid for with reduced performance. This study shows that this performance reduction can be avoided by automatically transforming a certain class of queries formulated in terms of views into equivalent queries on their base relations. This transformation is performed on the source level of SQL and uses the functional dependencies of the base relations to remove redundant join operations. Performance measurements in a real application of System/R show that this method is very efficient.
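A small sketch of the redundant-join situation described above, assuming a hypothetical `emp` table and two hypothetical views over it. Since `id` is the key of `emp` (so `id` functionally determines `name` and `salary`), joining the two views on `id` returns exactly what a single scan of the base relation returns. The rewriting is done by hand here; it is not the study's transformation algorithm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO emp VALUES (1, 'Ada', 120), (2, 'Alan', 110);

-- Two user-specific views over the same base relation (hypothetical names).
CREATE VIEW emp_names AS SELECT id, name FROM emp;
CREATE VIEW emp_pay   AS SELECT id, salary FROM emp;
""")

# Formulated on the views, the query needs a join on the key.
via_views = conn.execute(
    "SELECT n.name, p.salary "
    "FROM emp_names AS n JOIN emp_pay AS p ON n.id = p.id "
    "ORDER BY n.id"
).fetchall()

# Equivalent query on the base relation: id is a key, so the join is redundant.
on_base = conn.execute(
    "SELECT name, salary FROM emp ORDER BY id"
).fetchall()

print(via_views == on_base)
```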

8.
Designing data warehouses   Total citations: 9, self-citations: 0, citations by others: 9
A Data Warehouse (DW) is a database that collects and stores data from multiple remote and heterogeneous information sources. When a query is posed, it is evaluated locally, without accessing the original information sources. In this paper we deal with the issue of designing a DW, in the context of the relational model, by selecting a set of views to materialize in the DW. First, we briefly present a theoretical framework for the DW design problem, which concerns the selection of a set of views that (a) fit in the space allocated to the DW, (b) answer all the queries of interest, and (c) minimize the total query evaluation and view maintenance cost. We then formalize the DW design problem as a state space search problem by taking into account multiquery optimization over the maintenance queries (i.e., queries that compute changes to the materialized views) and the use of auxiliary views for reducing the view maintenance cost. Finally, incremental algorithms and heuristics for pruning the search space are presented.

9.
We present ViewDF: a flexible and declarative framework for incremental maintenance of materialized views (i.e., results of continuous queries) over streaming data. The main component of the proposed framework is the View Delta Function (ViewDF), which declaratively specifies how to update a materialized view when a new batch of data arrives. We describe and experimentally evaluate a prototype system based on this idea, which allows users to write ViewDFs directly and automatically translates common classes of streaming queries into ViewDFs. Our approach generalizes existing work on incremental view maintenance and enables new optimizations for views that are common in stream analytics, including those with pattern matching and sliding windows.
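The notion of a delta function, which updates a materialized view from a new batch instead of recomputing it, can be illustrated with a per-key count. This is a minimal sketch in plain Python: `apply_delta` and `recompute` are hypothetical helper names, and nothing here reflects the actual ViewDF syntax or prototype.

```python
from collections import Counter

def apply_delta(view: Counter, batch: list[str]) -> Counter:
    """Fold one new batch into the materialized per-key counts, in place."""
    view.update(batch)  # only keys that occur in the batch are touched
    return view

def recompute(history: list[list[str]]) -> Counter:
    """Baseline: rebuild the view from every batch seen so far."""
    total = Counter()
    for batch in history:
        total.update(batch)
    return total

batches = [["a", "b", "a"], ["b", "c"], ["a"]]

view = Counter()
for batch in batches:
    apply_delta(view, batch)

# The incrementally maintained view agrees with full recomputation.
print(view == recompute(batches))
```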

10.
Information Systems, 2001, 26(5): 323-362
We consider a variant of the view maintenance problem: How does one keep a materialized view up-to-date when the view definition itself changes? Can one do better than recomputing the view from the base relations? Traditional view maintenance tries to maintain the materialized view in response to modifications to the base relations; we try to “adapt” the view in response to changes in the view definition. Such techniques are needed for applications where the user can change queries dynamically and wants to see the changes in the results fast. Data archaeology, data visualization, and dynamic queries are examples of such applications. Views defined over the Internet tend to evolve and our technique can be useful for adapting such views. We consider all possible redefinitions of SQL SELECT-FROM-WHERE-GROUP-BY-HAVING, UNION, and EXCEPT views, and show how these views can be adapted using the old materialization for the cases where it is possible to do so. We identify extra information that can be kept with a materialization to facilitate redefinition. Multiple simultaneous changes to a view can be handled without necessarily materializing intermediate results. We identify guidelines for users and database administrators that can be used to facilitate efficient view adaptation. We perform a systematic experimental evaluation of our proposed techniques. Our evaluation indicates that adaptation is much more efficient than rematerialization in most cases. In-place adaptation methods are better than the non-in-place methods when the change is small. We also point out some important factors that can impact the efficiency of adaptation.
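One simple case of adapting a view from its old materialization is a WHERE condition that becomes more restrictive: every row of the new view is already in the old one, so filtering the old materialization suffices. The sketch below assumes a hypothetical `orders` relation and thresholds; it illustrates only this one case, not the paper's general technique.

```python
# Hypothetical base relation: (order_id, amount).
orders = [("o1", 50), ("o2", 150), ("o3", 250), ("o4", 300)]

def materialize(rows, threshold):
    """Compute the view 'amount > threshold' from the base relation."""
    return [r for r in rows if r[1] > threshold]

# Old definition: amount > 100, materialized once from the base relation.
old_view = materialize(orders, 100)

# New definition: amount > 200. It is more restrictive, and the amount
# column was kept in the materialization, so the old view can be filtered
# without touching the base relation.
adapted = [r for r in old_view if r[1] > 200]

print(adapted == materialize(orders, 200))
```

Relaxing the condition instead (say, to amount > 50) cannot be handled this way, because the missing rows would have to be fetched from the base relation.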

11.
Selection of views to materialize in a data warehouse   Total citations: 4, self-citations: 0, citations by others: 4
A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of resource, e.g., materialization time, storage space, etc. In this work, we have developed a theoretical framework for the general problem of selection of views in a data warehouse. We present polynomial-time heuristics for a selection of views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, viz.: 1) an AND view graph, where each query/view has a unique evaluation, e.g., when a multiple-query optimizer can be used to generate a global evaluation plan for the queries, and 2) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We present proofs showing that the algorithms are guaranteed to provide a solution that is fairly close to (within a constant factor ratio of) the optimal solution. We extend our heuristic to the general AND-OR view graphs. Finally, we address in detail the view-selection problem under the maintenance cost constraint and present provably competitive heuristics.
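To make the space-constrained selection problem concrete, here is a deliberately simplified greedy sketch that picks candidate views by benefit per unit of space. It assumes each view has a fixed, independent benefit, which is a strong simplification: in the setting of the abstract the benefit of a view depends on which other views are already materialized. The view names, sizes, and benefits are hypothetical, and this is not the paper's heuristic.

```python
def greedy_select(candidates, space_budget):
    """Pick views by benefit per unit of space until nothing more fits.

    candidates: dict mapping view name -> (size, benefit).
    Returns the chosen view names (in pick order) and the space used.
    """
    by_density = sorted(
        candidates.items(),
        key=lambda item: item[1][1] / item[1][0],  # benefit / size
        reverse=True,
    )
    chosen, used = [], 0
    for name, (size, _benefit) in by_density:
        if used + size <= space_budget:  # skip views that no longer fit
            chosen.append(name)
            used += size
    return chosen, used

# Hypothetical candidate views: (size in MB, saved query cost).
candidates = {
    "sales_by_day": (40, 120),
    "sales_by_region": (30, 60),
    "sales_by_product": (50, 90),
}
chosen, used = greedy_select(candidates, space_budget=80)
print(chosen, used)
```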

12.
Materialized views defined over distributed data sources can be utilized by many applications to ensure better access, reliable performance, and high availability. Technology for maintaining materialized views is thus critical for providing up-to-date results, since a stale view extent may not help or may even mislead these applications. State-of-the-art incremental view maintenance requires O(n^2) or more remote maintenance queries, with n being the number of data sources in the view definition. In this work, we propose two novel maintenance strategies, namely adjacent grouping and conditional grouping, that dramatically reduce the number of maintenance queries required to maintain the materialized views. This reduction in the number of maintenance queries introduces a basic trade-off between the complexity of each query and the total number of maintenance queries, which can be exploited to improve maintenance performance. The proposed maintenance strategies have been implemented in a working prototype system called TxnWrap. Experimental studies illustrate that our proposed strategies are able to achieve about 400% performance improvement in terms of total processing time compared with existing batch algorithms in a majority of cases.

13.
We study the problem of maintaining recursively defined views, such as the transitive closure of a relation, in traditional relational languages that do not have recursion mechanisms. The main results of this paper are negative ones: we show that a certain property of query languages implies impossibility of such incremental maintenance. The property we use is locality of queries, which is known to hold for relational calculus and various extensions, including those with grouping and aggregate constructs (essentially, plain SQL).

14.
View selection for designing the global data warehouse   Total citations: 1, self-citations: 0, citations by others: 1
A global data warehouse (DW) integrates data from multiple distributed heterogeneous databases and other information sources. A global DW can be abstractly seen as a set of materialized views. The selection of views for materialization in a DW is an important decision in the design of a DW. Current commercial products do not provide tools for automatic DW design. We provide a general method that, given a set of select-project-join queries to be satisfied by the DW, generates sets of materialized views that satisfy all the input queries. This process is complex since ‘common subexpressions’ between the queries need to be detected and exploited. Our method is then applied to solve the problem of selecting a materialized view set that fits in the space allocated to the DW for materialization and minimizes the combined overall query evaluation and view maintenance cost. We design and implement algorithms, and we report on their experimental evaluation.

15.
The view selection problem is to choose a set of views to materialize over a database schema, such that the cost of evaluating a set of workload queries is minimized and such that the views fit into a prespecified storage constraint. The two main applications of the view selection problem are materializing views in a database to speed up query processing, and selecting views to materialize in a data warehouse to answer decision support queries. In addition, view selection is a core problem for intelligent data placement over a wide-area network for data integration applications and data management for ubiquitous computing. We describe several fundamental results concerning the view selection problem. We consider the problem for views and workloads that consist of equality-selection, project and join queries, and show that the complexity of the problem depends crucially on the quality of the estimates that a query optimizer has on the size of the views it is considering to materialize. When a query optimizer has good estimates of the sizes of the views, we show a somewhat surprising result, namely, that an optimal choice of views may involve a number of views that is exponential in the size of the database schema. On the other hand, when an optimizer uses standard estimation heuristics, we show that the number of necessary views and the expression size of each view are polynomially bounded. Received: November 20, 1001 / Accepted: May 30, 2002 / Published online: September 25, 2002

16.
Effective support for temporal applications by database systems represents an important technical objective that is difficult to achieve since it requires an integrated solution for several problems, including (i) expressive temporal representations and data models, (ii) powerful languages for temporal queries and snapshot queries, (iii) indexing, clustering and query optimization techniques for managing temporal information efficiently, and (iv) architectures that bring together the different pieces of enabling technology into a robust system. In this paper, we present the ArchIS system that achieves these objectives by supporting a temporally grouped data model on top of RDBMS. ArchIS’ architecture uses (a) XML to support temporally grouped (virtual) representations of the database history, (b) XQuery to express powerful temporal queries on such views, (c) temporal clustering and indexing techniques for managing the actual historical data in a relational database, and (d) SQL/XML for executing the queries on the XML views as equivalent queries on the relational database. The performance studies presented in the paper show that ArchIS is quite effective at storing and retrieving under complex query conditions the transaction-time history of relational databases, and can also assure excellent storage efficiency by providing compression as an option. This approach achieves full-functionality transaction-time databases without requiring temporal extensions in XML or database standards, and provides critical support to emerging application areas such as RFID.

17.
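Research on view problems for XML-based semi-structured data   Total citations: 1, self-citations: 0, citations by others: 1
1 Introduction. The view mechanism in a database mainly tailors data to the needs of users or applications in order to increase the flexibility of the database. A database view is an abstraction of the portion of the data in the database that suits a particular user or application. A view is defined in a view specification language, and the view specification is applied to the source database (or an equivalent base database). In general, a database view can be either virtual or materialized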

18.
Inductive databases integrate database querying with database mining. In this article, we present an inductive database system that does not rely on a new data mining query language, but on plain SQL. We propose an intuitive and elegant framework based on virtual mining views, which are relational tables that virtually contain the complete output of data mining algorithms executed over a given data table. We show that several types of patterns and models that are implicitly present in the data, such as itemsets, association rules, and decision trees, can be represented and queried with SQL using a unifying framework. As a proof of concept, we illustrate a complete data mining scenario with SQL queries over the mining views, which is executed in our system.

19.
A data warehouse (DW) can be seen as a set of materialized views defined over remote base relations. When a query is posed, it is evaluated locally, using the materialized views, without accessing the original information sources. The DWs are dynamic entities that evolve continuously over time. As time passes, new queries need to be answered by them. Some of these queries can be answered using exclusively the materialized views. In general though new views need to be added to the DW. In this paper we investigate the problem of incrementally designing a DW when new queries need to be answered and possibly extra space is allocated for view materialization. Based on an AND/OR dag representation of multiple queries, we model the problem as a state space search problem. We design incremental algorithms for selecting a set of new views to additionally materialize in the DW that: (a) fits in the extra space, (b) allows a complete rewriting of the new queries over the materialized views, and (c) minimizes the combined new query evaluation and new view maintenance cost. Finally, we discuss methods for pruning the search space so that efficiency is improved.

20.
Users of information systems would like to express flexible queries over the data, possibly retrieving imperfect items when the perfect ones, which exactly match the selection conditions, are not available. Most commercial DBMSs are still based on SQL for querying. Therefore, providing some flexibility to SQL can help users to improve their interaction with the systems without requiring them to learn a completely novel language. Based on fuzzy set theory and the α-cut operation on fuzzy numbers, this paper presents generic fuzzy queries against classical relational databases and develops the translation of the fuzzy queries. A generic fuzzy query is one whose query condition consists of complex fuzzy terms as the operands and complex fuzzy relations as the operators. With different thresholds that the user chooses for the fuzzy query, the user’s fuzzy queries can be translated into precise queries for classical relational databases.
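The α-cut step can be sketched for the simplest case of a triangular fuzzy number. For a triangle (a, b, c) with peak b, the α-cut is the interval [a + α(b − a), c − α(c − b)], which turns a fuzzy condition such as "age is about 30" into a precise range condition once the user fixes the threshold α. The function names, the `employee` table, and the choice of a triangular membership function are assumptions for illustration; the paper's translation of complex fuzzy terms and relations is more general.

```python
def alpha_cut(tri, alpha):
    """Interval of values whose membership in the triangle (a, b, c) is >= alpha."""
    a, b, c = tri
    return (a + alpha * (b - a), c - alpha * (c - b))

def to_precise_query(table, column, tri, alpha):
    """Translate 'column is <fuzzy number>' into a parameterized range query.

    table and column are trusted, hypothetical identifiers; the interval
    bounds are passed as parameters rather than spliced into the SQL text.
    """
    lo, hi = alpha_cut(tri, alpha)
    return f"SELECT * FROM {table} WHERE {column} BETWEEN ? AND ?", (lo, hi)

# "about 30" modelled as the triangular fuzzy number (20, 30, 40).
about_30 = (20, 30, 40)
sql, params = to_precise_query("employee", "age", about_30, alpha=0.5)
print(sql, params)
```

A higher threshold narrows the interval toward the peak (α = 1 gives exactly 30), while a lower one admits more "imperfect" matches.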
