1.

Efficient approaches for materialized views selection in a data warehouse

Ming-Chuan Hung Man-Lin Huang Nien-Lin Hsueh 《Information Sciences》2007,177(6):1333-1348

View materialization is an effective method to increase query efficiency in a data warehouse and improve OLAP query performance. However, one encounters the problem of space insufficiency if all possible views are materialized in advance. Reducing query time by means of selecting a proper set of materialized views with a lower cost is crucial for efficient data warehousing. In addition, the costs of data warehouse creation, query, and maintenance have to be taken into account while views are materialized. In this paper, we propose efficient algorithms to select a proper set of materialized views, constrained by storage and cost considerations, to help speed up the entire data warehousing process. We derive a cost model for data warehouse query and maintenance as well as efficient view selection algorithms that effectively exploit the gain and loss metrics. The main contribution of our paper is to speed up the selection process of materialized views. Concurrently, this will greatly reduce the overall cost of data warehouse query and maintenance.
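As a rough illustration of selecting views under a storage constraint using gain and loss figures, the sketch below applies a generic greedy heuristic. It is not the authors' algorithm: the `CandidateView` fields, the view names, the cost numbers, and the ranking by net benefit per unit of storage are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateView:
    name: str          # hypothetical view name
    size: int          # storage needed to materialize the view
    query_gain: float  # query cost saved if the view is materialized
    maint_loss: float  # maintenance cost incurred if it is materialized


def select_views(candidates, storage_budget):
    """Greedily pick views by net benefit per unit of storage."""
    def density(view):
        return (view.query_gain - view.maint_loss) / view.size

    chosen, used = [], 0
    for view in sorted(candidates, key=density, reverse=True):
        net = view.query_gain - view.maint_loss
        # Skip views that cost more to maintain than they save,
        # and views that no longer fit in the remaining space.
        if net > 0 and used + view.size <= storage_budget:
            chosen.append(view)
            used += view.size
    return chosen


candidates = [
    CandidateView("sales_by_day", size=40, query_gain=120.0, maint_loss=30.0),
    CandidateView("sales_by_region", size=10, query_gain=50.0, maint_loss=5.0),
    CandidateView("sales_by_sku", size=80, query_gain=90.0, maint_loss=100.0),
    CandidateView("sales_by_month", size=20, query_gain=60.0, maint_loss=10.0),
]
selected = select_views(candidates, storage_budget=60)
print([view.name for view in selected])
```

A greedy pass like this is cheap but not guaranteed optimal; it only shows how a storage budget, a gain metric, and a loss metric interact in one selection loop.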

2.

An efficient method for maintaining data cubes incrementally

Ki Yong Lee Yon Dohn Chung 《Information Sciences》2010,180(6):928-2059

The data cube operator computes group-bys for all possible combinations of a set of dimension attributes. Since computing a data cube typically incurs a considerable cost, the data cube is often precomputed and stored as materialized views in data warehouses. A materialized data cube needs to be updated when the source relations are changed. The incremental maintenance of a data cube is to compute and propagate only its changes, rather than recompute the entire data cube from scratch. For n dimension attributes, the data cube consists of 2ⁿ group-bys, each of which is called a cuboid. To incrementally maintain a data cube with 2ⁿ cuboids, the conventional methods compute 2ⁿ delta cuboids, each of which represents the change of a cuboid. In this paper, we propose an efficient incremental maintenance method that can maintain a data cube using only a subset of 2ⁿ delta cuboids. We formulate an optimization problem to find the optimal subset of 2ⁿ delta cuboids that minimizes the total maintenance cost, and propose a heuristic solution that allows us to maintain a data cube using only delta cuboids. As a result, the cost of maintaining a data cube is substantially reduced. Through various experiments, we show the performance advantages of the proposed method over the conventional methods. We also extend the proposed method to handle partially materialized cubes and dimension hierarchies.
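The conventional scheme mentioned in the abstract (one delta cuboid per cuboid) can be sketched as follows. This is only an illustration of that baseline, not the paper's reduced-subset method. It assumes a SUM measure and insert-only changes, and the dimension names and rows are made up.

```python
from collections import defaultdict
from itertools import combinations


def all_cuboids(dimensions):
    """Every subset of the dimension attributes is one group-by (cuboid)."""
    return [combo
            for r in range(len(dimensions) + 1)
            for combo in combinations(dimensions, r)]


def compute_cuboid(rows, group_by, measure):
    """SUM(measure) grouped by the given attributes."""
    result = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in group_by)
        result[key] += row[measure]
    return dict(result)


def apply_delta(cuboid, delta_cuboid):
    """Merge a delta cuboid into a stored cuboid (insertions only)."""
    merged = dict(cuboid)
    for key, value in delta_cuboid.items():
        merged[key] = merged.get(key, 0) + value
    return merged


dims = ("region", "product", "year")
base = [
    {"region": "EU", "product": "A", "year": 2009, "sales": 10},
    {"region": "EU", "product": "B", "year": 2009, "sales": 5},
    {"region": "US", "product": "A", "year": 2010, "sales": 7},
]
inserted = [{"region": "EU", "product": "A", "year": 2010, "sales": 3}]

cuboids = all_cuboids(dims)  # 2**3 = 8 group-bys
stored = {g: compute_cuboid(base, g, "sales") for g in cuboids}
# Conventional maintenance: one delta cuboid per stored cuboid.
deltas = {g: compute_cuboid(inserted, g, "sales") for g in cuboids}
maintained = {g: apply_delta(stored[g], deltas[g]) for g in cuboids}
recomputed = {g: compute_cuboid(base + inserted, g, "sales") for g in cuboids}
print(maintained == recomputed)
```

Deletions and non-distributive aggregates need extra bookkeeping (for example, a per-group count), which this sketch leaves out.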

3.

Extending OCL for OLAP querying on conceptual multidimensional models of data warehouses

Jesús Pardillo Jose-Norberto Mazón 《Information Sciences》2010,180(5):584-5028

The development of data warehouses begins with the definition of multidimensional models at the conceptual level in order to structure data, which will facilitate decision makers with an easier data analysis. Current proposals for conceptual multidimensional modelling focus on the design of static data warehouse structures, but few approaches model the queries which the data warehouse should support by means of OLAP (on-line analytical processing) tools. OLAP queries are, therefore, only defined once the rest of the data warehouse has been implemented, which prevents designers from verifying from the very beginning of the development whether the decision maker will be able to obtain the required information from the data warehouse. This article presents a solution to this drawback consisting of an extension to the object constraint language (OCL), which has been developed to include a set of predefined OLAP operators. These operators can be used to define platform-independent OLAP queries as a part of the specification of the data warehouse conceptual multidimensional model. Furthermore, OLAP tools require the implementation of queries to assure performance optimisations based on pre-aggregation. It is interesting to note that the OLAP queries defined by our approach can be automatically implemented in the rest of the data warehouse, in a coherent and integrated manner. This implementation is supported by a code-generation architecture aligned with model-driven technologies, in particular the MDA (model-driven architecture) proposal. Finally, our proposal has been validated by means of a set of sample data sets from a well-known case study.

4.

Finding an application-appropriate model for XML data warehouses

Franck Ravat Olivier Teste Ronan Tournier Gilles Zurfluh 《Information Systems》2010

Decision support systems help the decision making process with the use of OLAP (On-Line Analytical Processing) and data warehouses. These systems allow the analysis of corporate data. As OLAP and data warehousing evolve, more and more complex data is being used. XML (Extensible Markup Language) is a flexible text format allowing the interchange and the representation of complex data. Finding an appropriate model for an XML data warehouse tends to become complicated as more and more solutions appear. Hence, in this survey paper we present an overview of the different proposals that use XML within data warehousing technology. These proposals range from using XML data sources for regular warehouses to those using full XML warehousing solutions. Some studies merely focus on document storage facilities while others present adaptations of XML technology for OLAP. Even though there is a growing body of research on the subject, many issues still remain unsolved.

5.

Active data warehouses: complementing OLAP with analysis rules (total citations: 2; self-citations: 0; cited by others: 2)

Thomas Michael Mukesh 《Data & Knowledge Engineering》2001,39(3):241-269

Conventional data warehouses are passive. All tasks related to analysing data and making decisions must be carried out manually by analysts. Today's data warehouse and OLAP systems offer little support for automating decision tasks that occur frequently and for which well-established decision procedures are available. Such functionality can be provided by extending the conventional data warehouse architecture with analysis rules, which mimic the work of an analyst during decision making. Analysis rules extend the basic event/condition/action (ECA) rule structure with mechanisms to analyse data multidimensionally and to make decisions. The resulting architecture is called an active data warehouse.
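The event/condition/action structure that analysis rules build on can be pictured with a small sketch. It is a loose illustration only, not the paper's rule language: the `AnalysisRule` class, the `fire` function, the `(region, month)` cell layout, the sales threshold, and the event name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Cell = Tuple[str, str]  # (region, month): hypothetical cube dimensions


@dataclass
class AnalysisRule:
    event: str                                          # when the rule is triggered
    condition: Callable[[Dict[Cell, float]], List[Cell]]  # which cells qualify
    action: Callable[[Cell], str]                       # decision taken per cell


def fire(rules, event, cube):
    """Run every rule registered for the event and collect its decisions."""
    decisions = []
    for rule in rules:
        if rule.event != event:
            continue
        for cell in rule.condition(cube):
            decisions.append(rule.action(cell))
    return decisions


def low_sales(cube):
    # Condition: analyse the cube and return cells with sales under a threshold.
    return sorted(cell for cell, sales in cube.items() if sales < 100.0)


rule = AnalysisRule(
    event="end_of_month",
    condition=low_sales,
    action=lambda cell: f"review pricing for {cell[0]} in {cell[1]}",
)
cube = {("EU", "2001-01"): 250.0, ("US", "2001-01"): 80.0}
decisions = fire([rule], "end_of_month", cube)
print(decisions)
```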

6.

Designing data warehouses (total citations: 9; self-citations: 0; cited by others: 9)

Dimitri Timos 《Data & Knowledge Engineering》1999,31(3):279-301

A Data Warehouse (DW) is a database that collects and stores data from multiple remote and heterogeneous information sources. When a query is posed, it is evaluated locally, without accessing the original information sources. In this paper we deal with the issue of designing a DW, in the context of the relational model, by selecting a set of views to materialize in the DW. First, we briefly present a theoretical framework for the DW design problem, which concerns the selection of a set of views that (a) fit in the space allocated to the DW, (b) answer all the queries of interest, and (c) minimize the total query evaluation and view maintenance cost. We then formalize the DW design problem as a state space search problem by taking into account multiquery optimization over the maintenance queries (i.e., queries that compute changes to the materialized views) and the use of auxiliary views for reducing the view maintenance cost. Finally, incremental algorithms and heuristics for pruning the search space are presented.

7.

Data mining-based materialized view and index selection in data warehouses

Kamel Aouiche Jérôme Darmont 《Journal of Intelligent Information Systems》2009,33(1):65-93

Materialized views and indexes are physical structures for accelerating data access that are commonly used in data warehouses. However, these data structures generate some maintenance overhead. They also share the same storage space. Most existing studies about materialized view and index selection consider these structures separately. In this paper, we adopt the opposite stance and couple materialized view and index selection to take view–index interactions into account and achieve efficient storage space sharing. Candidate materialized views and indexes are selected through a data mining process. We also exploit cost models that evaluate the respective benefit of indexing and view materialization, and help select a relevant configuration of indexes and materialized views among the candidates. Experimental results show that our strategy performs better than an independent selection of materialized views and indexes.

8.

Determinacy and query rewriting for conjunctive queries and views

Foto N. Afrati 《Theoretical computer science》2011,412(11):1005-1021

Answering queries using views is the problem which examines how to derive the answers to a query when we only have the answers to a set of views. Constructing rewritings is a widely studied technique to derive those answers. In this paper we consider the problem of the existence of rewritings in the case where the answers to the views uniquely determine the answers to the query. Specifically, we say that a view set V determines a query Q if for any two databases D₁, D₂ it holds: V(D₁) = V(D₂) implies Q(D₁) = Q(D₂). We consider the case where query and views are defined by conjunctive queries and investigate the question: If a view set V determines a query Q, is there an equivalent rewriting of Q using V? We present here interesting cases where there are such rewritings in the language of conjunctive queries. Interestingly, we identify a class of conjunctive queries, CQ_path, for which a view set can produce equivalent rewritings for “almost all” queries which are determined by this view set. We introduce a problem which relates determinacy to query equivalence. We show that there are cases where restricted results can carry over to broader classes of queries.
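The determinacy definition can be made concrete with a brute-force counterexample search over small databases. This is a sketch under stated assumptions: a single binary relation R, a two-element domain, and hypothetical helpers `all_databases` and `find_counterexample`. Finding a pair (D₁, D₂) refutes determinacy. Finding none over a finite domain proves nothing in general.

```python
from itertools import combinations, product


def all_databases(domain):
    """Every instance of a single binary relation R over the given domain."""
    tuples = list(product(domain, repeat=2))
    return [frozenset(c)
            for r in range(len(tuples) + 1)
            for c in combinations(tuples, r)]


def find_counterexample(views, query, databases):
    """Return (D1, D2) with V(D1) == V(D2) but Q(D1) != Q(D2), else None."""
    seen = {}
    for db in databases:
        image = tuple(view(db) for view in views)
        answer = query(db)
        if image in seen and seen[image][1] != answer:
            return seen[image][0], db
        seen.setdefault(image, (db, answer))
    return None


def path2(db):
    # Q(x, z) :- R(x, y), R(y, z)
    return frozenset((x, z) for (x, y) in db for (y2, z) in db if y == y2)


def edges(db):
    # V(x, y) :- R(x, y)
    return frozenset(db)


def sources(db):
    # V(x) :- R(x, y)
    return frozenset(x for (x, _) in db)


dbs = all_databases((0, 1))  # 2**4 = 16 instances
print(find_counterexample([edges], path2, dbs))    # no pair found
print(find_counterexample([sources], path2, dbs))  # a refuting pair
```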

9.

Efficient evaluation of query rewriting plan over materialized XML view

Jun Gao Jiaheng Lu Dongqing Yang 《Journal of Systems and Software》2010,83(6):1029-1038

The query rewriting plan generation over XML views has received wide attention recently. However, little work has been done on efficient evaluation of the query rewriting plans, which is not trivial since the plan may contain an exponential number of sub-plans. This paper investigates the reason for the potentially exponential number of sub-plans, and then proposes a new space-efficient form called ABCPlan (Plan with Automata Based Combinations) to equivalently represent the original query rewriting plan. ABCPlan contains a set of buckets containing suffix paths in the query tree and an automaton to indicate the combination of the suffix paths from different buckets as valid query rewriting sub-plans. We also design an evaluation method called ABCScan, which constructs a unified evaluation tree for the ABCPlan and handles the evaluation tree in one scan of the XML view. In the evaluation, we introduce node existence automata to encode the structure of the sub-tree and convert the satisfaction of the ABCPlan into the intersection problem of deterministic finite automata. The experiments show that the ABCPlan-based method outperforms existing methods significantly in terms of scalability and efficiency.
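The DFA intersection problem mentioned in the abstract has a textbook solution via the product construction, sketched below. This is a generic illustration, not the paper's node existence automata or ABCScan. The `DFA` class and the example automata are assumptions made here.

```python
from collections import deque


class DFA:
    def __init__(self, start, accepting, transitions):
        self.start = start
        self.accepting = set(accepting)
        self.transitions = transitions  # {(state, symbol): next_state}


def intersection_nonempty(a, b, alphabet):
    """Breadth-first search of the product automaton.

    Returns True if some word is accepted by both DFAs.
    """
    start = (a.start, b.start)
    queue, visited = deque([start]), {start}
    while queue:
        sa, sb = queue.popleft()
        if sa in a.accepting and sb in b.accepting:
            return True
        for symbol in alphabet:
            na = a.transitions.get((sa, symbol))
            nb = b.transitions.get((sb, symbol))
            if na is None or nb is None:
                continue  # a missing transition means the word is rejected
            nxt = (na, nb)
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return False


alphabet = ("a", "b")
ends_with_a = DFA(0, {1}, {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0})
even_length = DFA("e", {"e"}, {("e", "a"): "o", ("e", "b"): "o",
                               ("o", "a"): "e", ("o", "b"): "e"})
only_b = DFA(0, {0}, {(0, "b"): 0})

print(intersection_nonempty(ends_with_a, even_length, alphabet))
print(intersection_nonempty(ends_with_a, only_b, alphabet))
```

The search visits at most the product of the two state counts, so the check is polynomial for two automata.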

10.

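Dynamically updating materialized views to improve OLAP query efficiency

武彤 赵雪 赵洵 《计算机科学》 (Computer Science) 2012,39(105):315-317

In data warehouse systems, OLAP queries generally involve two kinds of operations, multi-table joins and grouped aggregation, so improving the performance of these queries is the key to improving OLAP response speed. Materialized views make it possible to accurately compute and store the results of time-consuming operations such as table joins and aggregations. Studying an update algorithm for materialized views based on query frequency allows materialized views to be used with maximum efficiency and markedly shortens query response time, thereby improving OLAP query efficiency.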

11.

Answering queries using materialized views with minimum size

Rada Chirkova Chen Li Jia Li 《The VLDB Journal The International Journal on Very Large Data Bases》2006,15(3):191-210

In this paper, we study the following problem. Given a database and a set of queries, we want to find a set of views that can compute the answers to the queries, such that the amount of space, in bytes, required to store the viewset is minimum on the given database. (We also handle problem instances where the input has a set of database instances, as described by an oracle that returns the sizes of view relations for given view definitions.) This problem is important for applications such as distributed databases, data warehousing, and data integration. We explore the decidability and complexity of the problem for workloads of conjunctive queries. We show that results differ significantly depending on whether the workload queries have self-joins. Further, for queries without self-joins we describe a very compact search space of views, which contains all views in at least one optimal viewset. We present techniques for finding a minimum-size viewset for a single query without self-joins by using the shape of the query and its constraints, and validate the approach by extensive experiments. Part of this article was published elsewhere [Chirkova, R., Li, C.: Materializing views with minimal size to answer queries. PODS (2003)]. In addition to the prior materials, this article contains new theoretical results, as well as new results on how to efficiently implement the proposed techniques (Sects. 5 and 5.4).

12.

The automatic creation of OLAP cube using an MDA approach

Khadija Letrache Omar El Beggar Mohammed Ramdani 《Software》2017,47(12):1887-1903

The Model-Driven Architecture (MDA) is an approach that aligns modeling and automation for software development. By applying such an approach to data warehouse (DW) projects, we can save a great deal of time and cost. Furthermore, most OnLine Analytical Processing (OLAP) platforms seem to be like black boxes that provide wizards only to business intelligence developers to create and manipulate OLAP objects without allowing their sustainability and migration from one platform to another. That is why many works in the literature have proposed using the MDA approach in DW projects. However, most of them have mainly focused on the generation of the DW relational model from the conceptual one, and they overlooked the OLAP model and the cube implementation. To deal with this problem, we propose in this paper an MDA solution to automate the process of obtaining the OLAP cube and its implementation through a set of metamodels and automatic transformations among them. In fact, the proposal generates the OLAP and DW relational models (PSMs) from the conceptual one, using also a PDM model that describes the target business intelligence platform. After that, the source code to create the cube is obtained from both PSM models. For this aim, we define a set of transformation rules implemented using the Atlas transformation language. Finally, a case study will be provided to validate our approach.

13.

Incremental maintenance of object-oriented data warehouses

Ching-Ming Chao 《Information Sciences》2004,160(1-4):91-110

Incremental maintenance of data warehouses has attracted a lot of research attention for the past few years. Nevertheless, most of the previous work is confined to the relational setting. Recently, object-oriented data warehouses have been regarded as a better means to integrate data from modern heterogeneous data sources. However, existing approaches to incremental maintenance of data warehouses do not directly apply to object-oriented data warehouses. In this paper, therefore, we propose an approach to incremental maintenance of object-oriented data warehouses. We focus on two primary issues specifically. First, we identify six categories of potential updates to an object-oriented view and propose an algorithm to find potential updates from the definition of the view. Second, we propose an incremental view maintenance algorithm for maintaining object-oriented data warehouses. We have implemented a prototype system for incremental maintenance of object-oriented data warehouses. Performance evaluation has been conducted, which indicates that our approach is correct and efficient.

14.

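Materialized view selection strategy in data warehouses (total citations: 2; self-citations: 0; cited by others: 2)

林小静 薛永生 《计算机工程与设计》 (Computer Engineering and Design) 2007,28(13):3056-3059

To improve the response efficiency of decision support and OLAP queries, data warehouses commonly adopt materialized views. The materialized view selection strategy is therefore one of the important problems in data warehouse research. Its goal is to select a set of materialized views that minimizes the sum of storage, maintenance, and query costs. A new materialized view selection algorithm, VSMF (views selection base on multi-factor), is proposed, which uses the MVPP (multi-view processing plan) as the search space for view selection. Under a storage space constraint, the algorithm achieves multi-query optimization and view maintenance optimization at the same time.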

15.

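Query distribution statistics in materialized view selection

秦智平 袁捷 徐忠健 《计算机工程》 (Computer Engineering) 2004,30(3):99-100,103

This paper studies methods for collecting query distribution statistics while the system is running and analyzes the difficulties that direct collection may encounter. To address them, a query generalization method is proposed, and two query generalization algorithms are given. The work has some reference value for applying materialized view selection theory to data warehouse engineering practice.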

16.

Rewriting queries using views with access patterns under integrity constraints

Alin Deutsch Bertram Ludäscher Alan Nash 《Theoretical computer science》2007

We study the problem of rewriting queries using views in the presence of access patterns, integrity constraints, disjunction and negation. We provide asymptotically optimal algorithms for (1) finding minimally containing and (2) maximally contained rewritings respecting the access patterns (which we call executable) and for (3) deciding whether an exact executable rewriting exists. We show that rewriting queries using views in this case reduces (a) to rewriting queries with access patterns and constraints without views and also (b) to rewriting queries using views under constraints without access patterns. We show how to solve (a) directly and how to reduce (b) to rewriting queries under constraints only (semantic optimization). These reductions provide two separate routes to a unified solution for problems 1, 2 and 3 based on an extension of the relational chase theory to queries and constraints with disjunction and negation. We also handle equality and arithmetic comparisons. We also show that in an information integration setting, maximally contained rewritings are given by the certain answers (under the usual semantics) for a set of constraints derived from the binding patterns. That is, except for defining the appropriate constraints, binding patterns do not need special treatment. Finally, we show that if there is an exact executable rewriting, there is an executable rewriting which is a union of conjunctive queries with negation.

17.

Answering constraint-based mining queries on itemsets using previous materialized results

Roberto Esposito Rosa Meo Marco Botta 《Journal of Intelligent Information Systems》2006,26(1):95-111

In recent years, researchers have begun to study inductive databases, a new generation of databases for leveraging decision support applications. In this context, the user interacts with the DBMS using advanced, constraint-based languages for data mining where constraints have been specifically introduced to increase the relevance of the results and, at the same time, to reduce their volume. In this paper we study the problem of mining frequent itemsets using an inductive database. We propose a technique for query answering which consists of rewriting the query in terms of union and intersection of the result sets of other queries, previously executed and materialized. Unfortunately, the exploitation of past queries is not always applicable. We then present sufficient conditions for the optimization to apply and show that these conditions are strictly connected with the presence of functional dependencies between the attributes involved in the queries. We show some experiments on an initial prototype of an optimizer which demonstrate that this approach to query answering is viable and that in many practical cases it drastically reduces the query execution time.

18.

MiniCon: A scalable algorithm for answering queries using views (total citations: 5; self-citations: 0; cited by others: 5)

Rachel Pottinger Alon Halevy 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(2-3):182-198

The problem of answering queries using views is to find efficient methods of answering a query using a set of previously materialized views over the database, rather than accessing the database relations. The problem has received significant attention because of its relevance to a wide variety of data management problems, such as data integration, query optimization, and the maintenance of physical data independence. To date, the performance of proposed algorithms has received very little attention, and in particular, their scale up in the presence of a large number of views is unknown. We first analyze two previous algorithms, the bucket algorithm and the inverse-rules, and show their deficiencies. We then describe the MiniCon, a novel algorithm for finding the maximally-contained rewriting of a conjunctive query using a set of conjunctive views. We present the first experimental study of algorithms for answering queries using views. The study shows that the MiniCon scales up well and significantly outperforms the previous algorithms. We describe an extension of the MiniCon to handle comparison predicates, and show its performance experimentally. Finally, we describe how the MiniCon can be extended to the context of query optimization. Received: 15 October 2000 / Accepted: 15 April 2001 Published online: 28 June 2001

19.

Monotonic complements for independent data warehouses

D. Laurent J. Lechtenbörger N. Spyratos G. Vossen 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(4):295-315

Views over databases have regained attention in the context of data warehouses, which are seen as materialized views. In this setting, efficient view maintenance is an important issue, for which the notion of self-maintainability has been identified as desirable. In this paper, we extend the concept of self-maintainability to (query and update) independence within a formal framework, where independence with respect to arbitrary given sets of queries and updates over the sources can be guaranteed. To this end we establish an intuitively appealing connection between warehouse independence and view complements. Moreover, we study special kinds of complements, namely monotonic complements, and show how to compute minimal ones in the presence of keys and foreign keys in the underlying databases. Taking advantage of these complements, an algorithmic approach is proposed for the specification of independent warehouses with respect to given sets of queries and updates. Received: 21 November 2000 / Accepted: 1 May 2001 Published online: 6 September 2001

20.

Answering queries using views: A survey (total citations: 25; self-citations: 0; cited by others: 25)

Alon Y. Halevy 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(4):270-294

The problem of answering queries using views is to find efficient methods of answering a query using a set of previously defined materialized views over the database, rather than accessing the database relations. The problem has recently received significant attention because of its relevance to a wide variety of data management problems. In query optimization, finding a rewriting of a query using a set of materialized views can yield a more efficient query execution plan. To support the separation of the logical and physical views of data, a storage schema can be described using views over the logical schema. As a result, finding a query execution plan that accesses the storage amounts to solving the problem of answering queries using views. Finally, the problem arises in data integration systems, where data sources can be described as precomputed views over a mediated schema. This article surveys the state of the art on the problem of answering queries using views, and synthesizes the disparate works into a coherent framework. We describe the different applications of the problem, the algorithms proposed to solve it and the relevant theoretical results. Received: 1 August 1999 / Accepted: 23 March 2001 Published online: 6 September 2001