Similar literature
 Found 10 similar documents (search time: 140 ms)
1.
A data warehouse (DW) can be seen as a set of materialized views defined over remote base relations. When a query is posed, it is evaluated locally, using the materialized views, without accessing the original information sources. DWs are dynamic entities that evolve continuously over time. As time passes, new queries need to be answered by them. Some of these queries can be answered using exclusively the materialized views. In general, though, new views need to be added to the DW. In this paper we investigate the problem of incrementally designing a DW when new queries need to be answered and possibly extra space is allocated for view materialization. Based on an AND/OR dag representation of multiple queries, we model the problem as a state space search problem. We design incremental algorithms for selecting a set of new views to additionally materialize in the DW that: (a) fits in the extra space, (b) allows a complete rewriting of the new queries over the materialized views, and (c) minimizes the combined new query evaluation and new view maintenance cost. Finally, we discuss methods for pruning the search space so that efficiency is improved.

2.
《Information Systems》2001,26(5):363-381
A data warehouse (DW) can be abstractly seen as a set of materialized views defined over a set of remote data sources. A DW is intended to satisfy a set of queries. The views materialized in a DW relate to each other in a complex manner, through common subexpressions, in order to guarantee high query performance and low view maintenance cost. DWs are time-varying. As time passes, new materialized views are added in order to satisfy new queries, or for performance reasons, while old queries are dropped. The evolution of a DW can result in a redundant set of materialized views. In this paper, we address the problem of detecting redundant materialized views in a given DW view selection, that is, materialized views that can be removed from the DW without negatively affecting the query evaluation or the view maintenance process. Using an AND/OR dag representation for multiple queries and views, we first formalize the process of propagating source relation changes to the materialized views by exploiting common subexpressions between views and by using other materialized views that are not affected by these changes. Then, we provide an algorithm for detecting materialized views that are not needed in the process of propagating source relation changes to the DW. We also show how trivially redundant views can be identified in this process. Finally, we use these results to provide a procedure for detecting materialized views that are redundant in a DW. Our approach considers a broad class of views that includes grouping/aggregation views and is not dependent on a specific cost model.

3.
Designing data warehouses   Total citations: 9, self-citations: 0, citations by others: 9
A Data Warehouse (DW) is a database that collects and stores data from multiple remote and heterogeneous information sources. When a query is posed, it is evaluated locally, without accessing the original information sources. In this paper we deal with the issue of designing a DW, in the context of the relational model, by selecting a set of views to materialize in the DW. First, we briefly present a theoretical framework for the DW design problem, which concerns the selection of a set of views that (a) fit in the space allocated to the DW, (b) answer all the queries of interest, and (c) minimize the total query evaluation and view maintenance cost. We then formalize the DW design problem as a state space search problem by taking into account multiquery optimization over the maintenance queries (i.e., queries that compute changes to the materialized views) and the use of auxiliary views for reducing the view maintenance cost. Finally, incremental algorithms and heuristics for pruning the search space are presented.
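The three selection criteria in this abstract can be read as a constrained optimization problem. The formulation below is only a sketch in assumed notation (none of these symbols come from the paper): $\mathbf{Q}$ is the set of queries of interest, $\mathbf{V}$ a candidate set of views, $|V|$ the storage size of view $V$, $S$ the space allocated to the DW, $E(\mathbf{Q}, \mathbf{V})$ the total query evaluation cost, and $M(\mathbf{V})$ the total view maintenance cost.

```latex
\begin{aligned}
\min_{\mathbf{V}} \quad & E(\mathbf{Q}, \mathbf{V}) + M(\mathbf{V})
  && \text{(c) combined evaluation and maintenance cost} \\
\text{subject to} \quad & \sum_{V \in \mathbf{V}} |V| \le S
  && \text{(a) the views fit in the allocated space} \\
& \text{every } Q \in \mathbf{Q} \text{ has a complete rewriting over } \mathbf{V}
  && \text{(b) all queries of interest are answerable}
\end{aligned}
```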

4.
Selection of views to materialize in a data warehouse   Total citations: 4, self-citations: 0, citations by others: 4
A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of resource, e.g., materialization time, storage space, etc. In this work, we have developed a theoretical framework for the general problem of selection of views in a data warehouse. We present polynomial-time heuristics for a selection of views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, viz.: 1) an AND view graph, where each query/view has a unique evaluation, e.g., when a multiple-query optimizer can be used to generate a global evaluation plan for the queries, and 2) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We present proofs showing that the algorithms are guaranteed to provide a solution that is fairly close to (within a constant factor ratio of) the optimal solution. We extend our heuristic to the general AND-OR view graphs. Finally, we address in detail the view-selection problem under the maintenance cost constraint and present provably competitive heuristics.
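To make the OR view graph case more concrete, here is a minimal sketch of a greedy benefit-per-unit-space heuristic on a tiny data cube. It is an illustration under simplifying assumptions, not the algorithm from the paper: every view is queried equally often, the cost of answering a query equals the size of the smallest materialized view that can compute it, and maintenance cost is ignored. The names `sizes`, `ancestors`, `greedy_select`, and the example lattice are all hypothetical.

```python
def query_cost(view, materialized, sizes, ancestors):
    """Cost of answering `view`: size of the smallest materialized view
    that is either `view` itself or one of its ancestors (OR semantics)."""
    usable = [v for v in materialized if v == view or v in ancestors[view]]
    return min(sizes[v] for v in usable)


def total_cost(materialized, sizes, ancestors):
    """Sum of query costs, assuming each view is queried exactly once."""
    return sum(query_cost(q, materialized, sizes, ancestors) for q in sizes)


def greedy_select(sizes, ancestors, top, space_budget):
    """Repeatedly add the view with the best benefit per unit of space.

    `top` is the base view; it is assumed to be always materialized and
    is not charged against `space_budget` (an assumption of this sketch).
    """
    materialized = {top}
    used = 0
    while True:
        base = total_cost(materialized, sizes, ancestors)
        best, best_ratio = None, 0.0
        for v in sizes:
            if v in materialized or used + sizes[v] > space_budget:
                continue
            benefit = base - total_cost(materialized | {v}, sizes, ancestors)
            ratio = benefit / sizes[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:  # nothing fits or nothing helps any more
            return materialized
        materialized.add(best)
        used += sizes[best]


# Hypothetical two-dimensional cube: AB is the base view.
sizes = {"AB": 100, "A": 50, "B": 20, "none": 1}
ancestors = {"AB": set(), "A": {"AB"}, "B": {"AB"}, "none": {"A", "B", "AB"}}
print(sorted(greedy_select(sizes, ancestors, "AB", space_budget=25)))
# prints ['AB', 'B', 'none']
```

With a budget of 25 units, view `A` (size 50) never fits, so the sketch picks `none` first (highest benefit per unit of space) and then `B`.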

5.
View materialization is one of the most important techniques applied in multidimensional databases. The problem of selecting a set of views for materialization that minimizes query response time under a storage space constraint has received significant attention over the last twenty years. Many researchers concentrate on designing better view selection methods with respect to the running time or the cost of the solution. This paper summarizes our research on the problem of how much space should be allocated for view materialization to ensure good query performance. In order to comprehensively investigate the problem and minimize the influence of atypical cases, the experiments described in this paper were done on a large data set, including large data cubes, rarely considered in previous papers. In particular, the relation between the number of data cube views and the space limit, expressed as a percentage of the fully materialized data cube size and as a multiple of the base view size, is analysed. According to our experimental results, the allocation of large space for view materialization is not cost effective.

6.
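Materialized views are an effective means of improving query efficiency in a data warehouse. While the data warehouse is running, its materialized views must be maintained so that user query response times remain short. To address the slow convergence and long running time of the genetic algorithm used for dynamic selection of materialized views, a preprocessing algorithm is proposed that computes the initial population of the genetic algorithm. Theoretical analysis and experimental results show that the algorithm effectively improves the convergence speed of the search during dynamic selection of materialized views.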

7.
《Information Systems》2001,26(5):323-362
We consider a variant of the view maintenance problem: How does one keep a materialized view up-to-date when the view definition itself changes? Can one do better than recomputing the view from the base relations? Traditional view maintenance tries to maintain the materialized view in response to modifications to the base relations; we try to “adapt” the view in response to changes in the view definition. Such techniques are needed for applications where the user can change queries dynamically and wants to see the changes in the results fast. Data archaeology, data visualization, and dynamic queries are examples of such applications. Views defined over the Internet tend to evolve and our technique can be useful for adapting such views. We consider all possible redefinitions of SQL SELECT-FROM-WHERE-GROUP-BY-HAVING, UNION, and EXCEPT views, and show how these views can be adapted using the old materialization for the cases where it is possible to do so. We identify extra information that can be kept with a materialization to facilitate redefinition. Multiple simultaneous changes to a view can be handled without necessarily materializing intermediate results. We identify guidelines for users and database administrators that can be used to facilitate efficient view adaptation. We perform a systematic experimental evaluation of our proposed techniques. Our evaluation indicates that adaptation is much more efficient than rematerialization in most cases. In-place adaptation methods are better than the non-in-place methods when the change is small. We also point out some important factors that can impact the efficiency of adaptation.
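As a small illustration of the idea of adapting a view from its old materialization, consider only the simplest case: the WHERE predicate of a selection view is tightened, so the new view is a subset of the old one and can be obtained by filtering the stored rows instead of rescanning the base relation. This is a sketch under that assumption and does not reproduce the techniques of the paper; the function names and the sample rows are hypothetical.

```python
def materialize(base_rows, predicate):
    """Compute a selection view from scratch over the base relation."""
    return [row for row in base_rows if predicate(row)]


def adapt_by_tightening(old_view_rows, new_predicate):
    """Adapt a stored selection view to a stricter predicate.

    Correct only when the new predicate implies the old one, so that
    every qualifying row is already present in the old materialization.
    """
    return [row for row in old_view_rows if new_predicate(row)]


# Hypothetical base relation and view definitions.
employees = [
    {"name": "a", "salary": 40},
    {"name": "b", "salary": 60},
    {"name": "c", "salary": 90},
]
old_view = materialize(employees, lambda r: r["salary"] > 50)
new_view = adapt_by_tightening(old_view, lambda r: r["salary"] > 80)

# The adapted view matches a full recomputation from the base relation.
assert new_view == materialize(employees, lambda r: r["salary"] > 80)
```

When the predicate is relaxed rather than tightened, the old materialization no longer contains every qualifying row, so the base relation (or extra stored information) has to be consulted for the missing ones.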

8.
View adaptation consists of adapting a set of materialized views in response to schema changes of source relations and/or after view redefinition. Recently, several view selection methods that are based on materializing fragments of the view rather than the whole view have been proposed. We call this approach the fragment-based approach. This paper presents a view adaptation method in the fragment-based approach, which is aimed at exploiting the opportunities to share not only materialized data, but also computation between the different views. In order to do this, the views are modeled using the so-called multiview materialization graph, which represents the views as a bipartite directed acyclic graph whose nodes are operations and fragments of the views. Then, the adaptation is performed regarding all materialized views and not solely the old materialization of the view. However, the data independence is preserved for the views that are not affected by the change. By contrast, in related work, the adaptation technique is based solely on the old materialization of the same view. We studied the impact of the fragmentation on the adaptation techniques and showed the advantages and drawbacks of this approach.

9.
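Research on the view problem for XML-based semi-structured data   Total citations: 1, self-citations: 0, citations by others: 1
1 Introduction. The view mechanism in a database mainly tailors data to the needs of users or applications in order to increase the flexibility of the database. A database view is an abstraction of the part of the data in the database that suits a particular user or application. A view is defined in a View Specification Language, and the view specification is applied to the source database (or an equivalent base database). In general, a database view can be either virtual or materialized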

10.
View materialization is an effective method to increase query efficiency in a data warehouse and improve OLAP query performance. However, one encounters the problem of insufficient space if all possible views are materialized in advance. Reducing query time by selecting a proper set of materialized views with a lower cost is crucial for efficient data warehousing. In addition, the costs of data warehouse creation, query, and maintenance have to be taken into account when views are materialized. In this paper, we propose efficient algorithms to select a proper set of materialized views, constrained by storage and cost considerations, to help speed up the entire data warehousing process. We derive a cost model for data warehouse query and maintenance as well as efficient view selection algorithms that effectively exploit the gain and loss metrics. The main contribution of our paper is to speed up the selection process of materialized views. At the same time, this will greatly reduce the overall cost of data warehouse query and maintenance.
