
1.

Ultrawrap: SPARQL execution on relational data

《Journal of Web Semantics》2013

The Semantic Web’s promise of web-wide data integration requires the inclusion of legacy relational databases, i.e. the execution of SPARQL queries on an RDF representation of the legacy relational data. We explore a hypothesis: existing commercial relational databases already subsume the algorithms and optimizations needed to support effective SPARQL execution on existing relationally stored data. The experiment is embodied in a system, Ultrawrap, that encodes a logical representation of the database as an RDF graph using SQL views and a simple syntactic translation of SPARQL queries to SQL queries on those views. Thus, in the course of executing a SPARQL query, the SQL optimizer uses the SQL views that represent a mapping of relational data to RDF, and optimizes its execution. In contrast, related research is predicated on incorporating optimizing transforms as part of the SPARQL to SQL translation, and/or executing some of the queries outside the underlying SQL environment. Ultrawrap is evaluated using two existing benchmark suites that derive their RDF data from relational data through a Relational Database to RDF (RDB2RDF) Direct Mapping, and the evaluation is repeated for each of the three major relational database management systems. Empirical analysis reveals two existing relational query optimizations that, if applied to the SQL produced from a simple syntactic translation of SPARQL queries (with bound predicate arguments) to SQL, consistently yield query execution times comparable to those of SQL queries written directly for the relational representation of the data. The analysis further reveals that the two optimizations are not uniquely required to achieve a successful wrapper system. The evidence suggests effective wrappers will be those that are designed to complement the optimizer of the target database.
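The idea of exposing relational data as RDF through SQL views, and translating a SPARQL query syntactically into SQL over those views, can be illustrated with a minimal sketch. This is not Ultrawrap's actual view definition or translation; the `person` table, the `triples` view, and the IRI-like strings are hypothetical names chosen for illustration, and SQLite stands in for a commercial database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO person VALUES
        (1, 'Ada', 'London'), (2, 'Alan', 'London'), (3, 'Grace', 'Arlington');

    -- Hypothetical triple view: one (s, p, o) row per non-null attribute value.
    CREATE VIEW triples AS
        SELECT 'person/' || id AS s, 'person#name' AS p, name AS o
        FROM person WHERE name IS NOT NULL
        UNION ALL
        SELECT 'person/' || id AS s, 'person#city' AS p, city AS o
        FROM person WHERE city IS NOT NULL;
""")

# SPARQL (both predicates bound):
#   SELECT ?name WHERE { ?x person:city "London" . ?x person:name ?name }
# Syntactic translation: one alias of the triple view per triple pattern,
# joined on the shared variable ?x. The SQL optimizer decides how to run it.
sql = """
    SELECT t2.o AS name
    FROM triples AS t1 JOIN triples AS t2 ON t1.s = t2.s
    WHERE t1.p = 'person#city' AND t1.o = 'London'
      AND t2.p = 'person#name'
    ORDER BY name
"""
names = [row[0] for row in conn.execute(sql)]
print(names)
```

Whether the self-join over the view collapses into a single scan of `person` depends on the optimizer of the target database, which is the point the abstract makes about complementing that optimizer.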

2.

Extending ER models to capture database transformations to build data sets for data mining

《Data & Knowledge Engineering》2014

In a data mining project developed on a relational database, a significant effort is required to build a data set for analysis. The main reason is that, in general, the database has a collection of normalized tables that must be joined, aggregated and transformed in order to build the required data set. Such a scenario results in many complex SQL queries that are written independently of each other, in a disorganized manner. Therefore, the database grows with many tables and views that are not present as entities in the ER model, and similar SQL queries are written multiple times, creating problems in database evolution and software maintenance. In this paper, we classify potential database transformations, we extend an ER diagram with entities capturing database transformations, and we introduce an algorithm which automates the creation of such an extended ER model. We present a case study with a public database illustrating database transformations to build a data set to compute a typical data mining model.

3.

Incremental join view maintenance on distributed log-structured storage

Huichao DUAN Huiqi HU Weining QIAN Aoying ZHOU 《Frontiers of Computer Science》2021,15(4):154607

Modern database systems urgently need the ability to support highly scalable transactions and efficient queries simultaneously for real-time applications. One solution is to utilize query optimization techniques on on-line transaction processing (OLTP) systems. The materialized view is considered a panacea for decreasing query latency. However, it also involves a significant maintenance cost, which trades away transaction performance. In this paper, we examine the design space and identify several design features for the implementation of a view on a distributed log-structured merge-tree (LSM-tree), which is a well-known structure for improving data write performance. As a result, we develop two incremental view maintenance (IVM) approaches on the LSM-tree. One avoids join computation in view maintenance transactions. The other, with two optimizations, is proposed to decouple view maintenance from transaction processing. Under asynchronous update, we also provide consistent queries for views. Experiments on the TPC-H benchmark show that our methods achieve better performance than straightforward methods on different workloads.

4.

Detecting redundant materialized views in data warehouse evolution

《Information Systems》2001,26(5):363-381

A data warehouse (DW) can be abstractly seen as a set of materialized views defined over a set of remote data sources. A DW is intended to satisfy a set of queries. The views materialized in a DW relate to each other in a complex manner, through common subexpressions, in order to guarantee high query performance and low view maintenance cost. DWs are time varying. As time passes new materialized views are added in order to satisfy new queries, or for performance reasons, while old queries are dropped. The evolution of a DW can result in a redundant set of materialized views. In this paper, we address the problem of detecting redundant materialized views in a given DW view selection, that is, materialized views that can be removed from the DW without negatively affecting the query evaluation or the view maintenance process. Using an AND/OR dag representation for multiple queries and views, we first formalize the process of propagating source relation changes to the materialized views by exploiting common subexpressions between views and by using other materialized views that are not affected by these changes. Then, we provide an algorithm for detecting materialized views that are not needed in the process of propagating source relation changes to the DW. We also show how trivially redundant views can be identified in this process. Finally, we use these results to provide a procedure for detecting materialized views that are redundant in a DW. Our approach considers a broad class of views that includes grouping/aggregation views and is not dependent on a specific cost model.

5.

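Research on a system framework for incremental update of XQuery materialized views

Peng Lei, Liao Husheng, Jin Xueyun 《计算机应用与软件》 (Computer Applications and Software) 2011,28(6)

In Web applications, querying information in XML format is often affected by factors such as limited network transmission speed. To reduce the network data transfer overhead required to keep XML materialized views consistent with their data sources, an incremental maintenance method and system framework for remote XML materialized views is proposed. Based on the query requests of multiple users and the update information of the data sources, the method generates view maintenance program code and replaces repeated querying of the XML views with network migration of that code, effectively reducing the amount of data transferred over the network. The basic principles of incremental materialized view maintenance, the system framework, and the design and implementation approach are introduced. Finally, performance tests show that this incremental maintenance system can effectively reduce transmission overhead.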

6.

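An efficient view maintenance mechanism in multidatabase systems (cited by: 1; self-citations: 0; citations by others: 1)

Han Weihong, Jia Yan, Wang Zhiying, Yang Xiaodong 《计算机研究与发展》 (Journal of Computer Research and Development) 2000,37(7):789-795

The outer join is a commonly used method for generating global views in multidatabase systems, but outer joins make maintaining multidatabase views very difficult. Existing multidatabase view maintenance algorithms consider only select-project-join (SPJ) operations; if the global view is generated through outer joins, these algorithms cannot effectively maintain the correctness of the multidatabase views. A new multidatabase view maintenance algorithm is proposed that maintains views efficiently in the presence of outer joins and data inconsistency, and minimizes the number of queries sent to the local databases, making the multidatabase system more efficient.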

7.

Removing redundant join operations in queries involving views

Nikolaus Ott Klaus Horländer 《Information Systems》1985,10(3):279-288

Views are understood as a good means to tailor base relations individually to the needs of each user. However, if a user formulates his queries in terms of views he often has no chance to express these queries without joins. In terms of base relations many of these joins would not be necessary, and therefore the advantages of the view concept are paid for with reduced performance. This study shows that this performance reduction can be avoided by automatically transforming a certain class of queries formulated in terms of views into equivalent queries on their base relations. This transformation is performed on the source level of SQL and uses the functional dependencies of the base relations to remove redundant join operations. Performance measurements in a real application of System/R show that this method is very efficient.

8.

Designing data warehouses (cited by: 9; self-citations: 0; citations by others: 9)

Dimitri Timos 《Data & Knowledge Engineering》1999,31(3):279-301

A Data Warehouse (DW) is a database that collects and stores data from multiple remote and heterogeneous information sources. When a query is posed, it is evaluated locally, without accessing the original information sources. In this paper we deal with the issue of designing a DW, in the context of the relational model, by selecting a set of views to materialize in the DW. First, we briefly present a theoretical framework for the DW design problem, which concerns the selection of a set of views that (a) fit in the space allocated to the DW, (b) answer all the queries of interest, and (c) minimize the total query evaluation and view maintenance cost. We then formalize the DW design problem as a state space search problem by taking into account multiquery optimization over the maintenance queries (i.e., queries that compute changes to the materialized views) and the use of auxiliary views for reducing the view maintenance cost. Finally, incremental algorithms and heuristics for pruning the search space are presented.

9.

ViewDF: Declarative incremental view maintenance for streaming data

《Information Systems》2017

We present ViewDF: a flexible and declarative framework for incremental maintenance of materialized views (i.e., results of continuous queries) over streaming data. The main component of the proposed framework is the View Delta Function (ViewDF), which declaratively specifies how to update a materialized view when a new batch of data arrives. We describe and experimentally evaluate a prototype system based on this idea, which allows users to write ViewDFs directly and automatically translates common classes of streaming queries into ViewDFs. Our approach generalizes existing work on incremental view maintenance and enables new optimizations for views that are common in stream analytics, including those with pattern matching and sliding windows.
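The general notion of a delta function, updating a materialized view from a new batch rather than recomputing it, can be sketched for a simple aggregate view. This is a generic illustration in plain Python, not the ViewDF language or its prototype; the function name `apply_delta` and the per-key (count, total) view layout are assumptions made for the example.

```python
from collections import defaultdict

def apply_delta(view, batch):
    """Fold a new batch of (key, value) pairs into a per-key (count, total) view.

    Only the keys present in the batch are touched; the rest of the
    materialized view is left as-is instead of being recomputed.
    """
    for key, value in batch:
        count, total = view[key]
        view[key] = (count + 1, total + value)
    return view

# Materialized view: key -> (row count, running sum), initially empty.
view = defaultdict(lambda: (0, 0.0))
apply_delta(view, [("a", 1.0), ("b", 2.0)])   # first batch arrives
apply_delta(view, [("a", 3.0)])               # second batch arrives
print(dict(view))
```

Views over sliding windows would additionally need to retract expired tuples, which this sketch does not attempt.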

10.

Adapting materialized views after redefinitions: techniques and a performance study

《Information Systems》2001,26(5):323-362

We consider a variant of the view maintenance problem: How does one keep a materialized view up-to-date when the view definition itself changes? Can one do better than recomputing the view from the base relations? Traditional view maintenance tries to maintain the materialized view in response to modifications to the base relations; we try to “adapt” the view in response to changes in the view definition. Such techniques are needed for applications where the user can change queries dynamically and wants to see the changes in the results fast. Data archaeology, data visualization, and dynamic queries are examples of such applications. Views defined over the Internet tend to evolve and our technique can be useful for adapting such views. We consider all possible redefinitions of SQL SELECT-FROM-WHERE-GROUP-BY-HAVING, UNION, and EXCEPT views, and show how these views can be adapted using the old materialization for the cases where it is possible to do so. We identify extra information that can be kept with a materialization to facilitate redefinition. Multiple simultaneous changes to a view can be handled without necessarily materializing intermediate results. We identify guidelines for users and database administrators that can be used to facilitate efficient view adaptation. We perform a systematic experimental evaluation of our proposed techniques. Our evaluation indicates that adaptation is much more efficient than rematerialization in most cases. In-place adaptation methods are better than the non-in-place methods when the change is small. We also point out some important factors that can impact the efficiency of adaptation.

11.

Selection of views to materialize in a data warehouse (cited by: 4; self-citations: 0; citations by others: 4)

Gupta H. Mumick I.S. 《Knowledge and Data Engineering, IEEE Transactions on》2005,17(1):24-43

A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of resource, e.g., materialization time, storage space, etc. In this work, we have developed a theoretical framework for the general problem of selection of views in a data warehouse. We present polynomial-time heuristics for a selection of views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, viz.: 1) an AND view graph, where each query/view has a unique evaluation, e.g., when a multiple-query optimizer can be used to generate a global evaluation plan for the queries, and 2) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We present proofs showing that the algorithms are guaranteed to provide a solution that is fairly close to (within a constant factor ratio of) the optimal solution. We extend our heuristic to the general AND-OR view graphs. Finally, we address in detail the view-selection problem under the maintenance cost constraint and present provably competitive heuristics.

12.

Maintaining large update batches by restructuring and grouping

Bin Liu Elke A. Rundensteiner David Finkel 《Information Systems》2007

Materialized views defined over distributed data sources can be utilized by many applications to ensure better access, reliable performance, and high availability. Technology for maintaining materialized views is thus critical for providing up-to-date results, since a stale view extent may not help, or may even mislead, these applications. State-of-the-art incremental view maintenance requires O(n²) or more remote maintenance queries, with n being the number of data sources in the view definition. In this work, we propose two novel maintenance strategies, namely adjacent grouping and conditional grouping, that dramatically reduce the number of maintenance queries required to maintain the materialized views. This reduction in the number of maintenance queries exposes a basic trade-off between the complexity of each query and the total number of maintenance queries, which can be exploited to improve maintenance performance. The proposed maintenance strategies have been implemented in a working prototype system called TxnWrap. Experimental studies illustrate that our proposed strategies are able to achieve about 400% performance improvement in terms of total processing time compared with existing batch algorithms in a majority of cases.

13.

Incremental recomputation in local languages

Guozhu Dong Leonid Libkin Limsoon Wong 《Information and Computation》2003,181(2):88-98

We study the problem of maintaining recursively defined views, such as the transitive closure of a relation, in traditional relational languages that do not have recursion mechanisms. The main results of this paper are negative ones: we show that a certain property of query languages implies impossibility of such incremental maintenance. The property we use is locality of queries, which is known to hold for relational calculus and various extensions, including those with grouping and aggregate constructs (essentially, plain SQL).

14.

View selection for designing the global data warehouse (cited by: 1; self-citations: 0; citations by others: 1)

Dimitri Spyros Timos 《Data & Knowledge Engineering》2001,39(3):219-240

A global data warehouse (DW) integrates data from multiple distributed heterogeneous databases and other information sources. A global DW can be abstractly seen as a set of materialized views. The selection of views for materialization in a DW is an important decision in the design of a DW. Current commercial products do not provide tools for automatic DW design. We provide a general method that, given a set of select-project-join queries to be satisfied by the DW, generates sets of materialized views that satisfy all the input queries. This process is complex since ‘common subexpressions’ between the queries need to be detected and exploited. Our method is then applied to solve the problem of selecting a materialized view set that fits in the space allocated to the DW for materialization and minimizes the combined overall query evaluation and view maintenance cost. We design and implement algorithms, and we report on their experimental evaluation.

15.

A formal perspective on the view selection problem

Rada Chirkova Alon Y. Halevy Dan Suciu 《The VLDB Journal The International Journal on Very Large Data Bases》2002,11(3):216-237

The view selection problem is to choose a set of views to materialize over a database schema, such that the cost of evaluating a set of workload queries is minimized and such that the views fit into a prespecified storage constraint. The two main applications of the view selection problem are materializing views in a database to speed up query processing, and selecting views to materialize in a data warehouse to answer decision support queries. In addition, view selection is a core problem for intelligent data placement over a wide-area network for data integration applications and data management for ubiquitous computing. We describe several fundamental results concerning the view selection problem. We consider the problem for views and workloads that consist of equality-selection, project and join queries, and show that the complexity of the problem depends crucially on the quality of the estimates that a query optimizer has on the size of the views it is considering materializing. When a query optimizer has good estimates of the sizes of the views, we show a somewhat surprising result, namely, that an optimal choice of views may involve a number of views that is exponential in the size of the database schema. On the other hand, when an optimizer uses standard estimation heuristics, we show that the number of necessary views and the expression size of each view are polynomially bounded. Received: November 20, 2001 / Accepted: May 30, 2002 / Published online: September 25, 2002

16.

ArchIS: an XML-based approach to transaction-time temporal database systems

Fusheng Wang Carlo Zaniolo Xin Zhou 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(6):1445-1463

Effective support for temporal applications by database systems represents an important technical objective that is difficult to achieve since it requires an integrated solution for several problems, including (i) expressive temporal representations and data models, (ii) powerful languages for temporal queries and snapshot queries, (iii) indexing, clustering and query optimization techniques for managing temporal information efficiently, and (iv) architectures that bring together the different pieces of enabling technology into a robust system. In this paper, we present the ArchIS system that achieves these objectives by supporting a temporally grouped data model on top of RDBMS. ArchIS’ architecture uses (a) XML to support temporally grouped (virtual) representations of the database history, (b) XQuery to express powerful temporal queries on such views, (c) temporal clustering and indexing techniques for managing the actual historical data in a relational database, and (d) SQL/XML for executing the queries on the XML views as equivalent queries on the relational database. The performance studies presented in the paper show that ArchIS is quite effective at storing and retrieving under complex query conditions the transaction-time history of relational databases, and can also assure excellent storage efficiency by providing compression as an option. This approach achieves full-functionality transaction-time databases without requiring temporal extensions in XML or database standards, and provides critical support to emerging application areas such as RFID.

17.

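Research on view problems for XML-based semistructured data (cited by: 1; self-citations: 0; citations by others: 1)

Nie Peiyao, Li Zhanhuai, et al. 《计算机科学》 (Computer Science) 2003,30(2):45-48

1 Introduction. The view mechanism in a database mainly tailors data to the needs of users or applications in order to increase the flexibility of the database. A database view is an abstraction of the part of the data in the database that suits a particular user or application. A view is defined in a view specification language, and the view specification is applied to the source database (or, equivalently, the base database). In general, a database view can be either virtual or materialized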

18.

An inductive database system based on virtual mining views

Hendrik Blockeel Toon Calders Élisa Fromont Bart Goethals Adriana Prado Céline Robardet 《Data mining and knowledge discovery》2012,24(1):247-287

Inductive databases integrate database querying with database mining. In this article, we present an inductive database system that does not rely on a new data mining query language, but on plain SQL. We propose an intuitive and elegant framework based on virtual mining views, which are relational tables that virtually contain the complete output of data mining algorithms executed over a given data table. We show that several types of patterns and models that are implicitly present in the data, such as itemsets, association rules, and decision trees, can be represented and queried with SQL using a unifying framework. As a proof of concept, we illustrate a complete data mining scenario with SQL queries over the mining views, which is executed in our system.

19.

Incremental Design of a Data Warehouse

Dimitri Theodoratos Timos Sellis 《Journal of Intelligent Information Systems》2000,15(1):7-27

A data warehouse (DW) can be seen as a set of materialized views defined over remote base relations. When a query is posed, it is evaluated locally, using the materialized views, without accessing the original information sources. DWs are dynamic entities that evolve continuously over time. As time passes, new queries need to be answered by them. Some of these queries can be answered using exclusively the materialized views. In general, though, new views need to be added to the DW. In this paper we investigate the problem of incrementally designing a DW when new queries need to be answered and possibly extra space is allocated for view materialization. Based on an AND/OR dag representation of multiple queries, we model the problem as a state space search problem. We design incremental algorithms for selecting a set of new views to additionally materialize in the DW that: (a) fits in the extra space, (b) allows a complete rewriting of the new queries over the materialized views, and (c) minimizes the combined new query evaluation and new view maintenance cost. Finally, we discuss methods for pruning the search space so that efficiency is improved.

20.

Generalization of strategies for fuzzy query translation in classical relational databases

《Information and Software Technology》2007,49(2):172-180

Users of information systems would like to express flexible queries over the data, possibly retrieving imperfect items when the perfect ones, which exactly match the selection conditions, are not available. Most commercial DBMSs are still based on SQL for querying. Therefore, providing some flexibility to SQL can help users to improve their interaction with the systems without requiring them to learn a completely novel language. Based on fuzzy set theory and the α-cut operation on fuzzy numbers, this paper presents generic fuzzy queries against classical relational databases and develops the translation of these fuzzy queries. Generic fuzzy queries are those whose query condition consists of complex fuzzy terms as the operands and complex fuzzy relations as the operators. With different thresholds that the user chooses for the fuzzy query, the user’s fuzzy queries can be translated into precise queries for classical relational databases.
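How a user-chosen threshold turns a fuzzy condition into a precise one can be sketched for the simplest case. The example below assumes a triangular fuzzy number (a, b, c), whose α-cut is the interval [a + α(b − a), c − α(c − b)], and renders it as a SQL range predicate. The function names, the `age` column, and the restriction to triangular shapes are assumptions for illustration; the paper's generic fuzzy terms and relations are more general than this.

```python
def alpha_cut(a, b, c, alpha):
    """Return the alpha-cut interval of the triangular fuzzy number (a, b, c).

    Membership rises linearly from 0 at a to 1 at b, then falls to 0 at c,
    so every value in the returned interval has membership >= alpha.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    return (a + alpha * (b - a), c - alpha * (c - b))

def fuzzy_equals_to_sql(column, fuzzy_number, alpha):
    """Translate 'column is about <fuzzy_number>' into a crisp, parameterized predicate."""
    low, high = alpha_cut(*fuzzy_number, alpha)
    return f"{column} BETWEEN ? AND ?", (low, high)

# "age is about 30", modelled as the triangular number (25, 30, 35), threshold 0.5.
predicate, params = fuzzy_equals_to_sql("age", (25, 30, 35), 0.5)
print(predicate, params)
```

Raising the threshold narrows the interval; at α = 1 the predicate degenerates to the exact match `age BETWEEN 30 AND 30`.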