Similar literature
 Found 20 similar documents (search time: 62 ms)
1.
Information Systems, 2001, 26(5): 363-381
A data warehouse (DW) can be abstractly seen as a set of materialized views defined over a set of remote data sources. A DW is intended to satisfy a set of queries. The views materialized in a DW relate to each other in a complex manner, through common subexpressions, in order to guarantee high query performance and low view maintenance cost. DWs are time varying. As time passes, new materialized views are added in order to satisfy new queries, or for performance reasons, while old queries are dropped. The evolution of a DW can result in a redundant set of materialized views. In this paper, we address the problem of detecting redundant materialized views in a given DW view selection, that is, materialized views that can be removed from the DW without negatively affecting the query evaluation or the view maintenance process. Using an AND/OR dag representation for multiple queries and views, we first formalize the process of propagating source relation changes to the materialized views by exploiting common subexpressions between views and by using other materialized views that are not affected by these changes. Then, we provide an algorithm for detecting materialized views that are not needed in the process of propagating source relation changes to the DW. We also show how trivially redundant views can be identified in this process. Finally, we use these results to provide a procedure for detecting materialized views that are redundant in a DW. Our approach considers a broad class of views that includes grouping/aggregation views and is not dependent on a specific cost model.

2.
View materialization is an effective method to increase query efficiency in a data warehouse and improve OLAP query performance. However, one encounters the problem of space insufficiency if all possible views are materialized in advance. Reducing query time by means of selecting a proper set of materialized views with a lower cost is crucial for efficient data warehousing. In addition, the costs of data warehouse creation, query, and maintenance have to be taken into account while views are materialized. In this paper, we propose efficient algorithms to select a proper set of materialized views, constrained by storage and cost considerations, to help speed up the entire data warehousing process. We derive a cost model for data warehouse query and maintenance as well as efficient view selection algorithms that effectively exploit the gain and loss metrics. The main contribution of our paper is to speed up the selection process of materialized views. Concurrently, this will greatly reduce the overall cost of data warehouse query and maintenance.

3.
Designing data warehouses   Cited by: 9 (self-citations: 0, citations by others: 9)
A Data Warehouse (DW) is a database that collects and stores data from multiple remote and heterogeneous information sources. When a query is posed, it is evaluated locally, without accessing the original information sources. In this paper we deal with the issue of designing a DW, in the context of the relational model, by selecting a set of views to materialize in the DW. First, we briefly present a theoretical framework for the DW design problem, which concerns the selection of a set of views that (a) fit in the space allocated to the DW, (b) answer all the queries of interest, and (c) minimize the total query evaluation and view maintenance cost. We then formalize the DW design problem as a state space search problem by taking into account multiquery optimization over the maintenance queries (i.e., queries that compute changes to the materialized views) and the use of auxiliary views for reducing the view maintenance cost. Finally, incremental algorithms and heuristics for pruning the search space are presented.

4.
View selection for designing the global data warehouse   Cited by: 1 (self-citations: 0, citations by others: 1)
A global data warehouse (DW) integrates data from multiple distributed heterogeneous databases and other information sources. A global DW can be abstractly seen as a set of materialized views. The selection of views for materialization in a DW is an important decision in the design of a DW. Current commercial products do not provide tools for automatic DW design. We provide a general method that, given a set of select-project-join queries to be satisfied by the DW, generates sets of materialized views that satisfy all the input queries. This process is complex since 'common subexpressions' between the queries need to be detected and exploited. Our method is then applied to solve the problem of selecting such a materialized view set that fits in the space allocated to the DW for materialization and minimizes the combined overall query evaluation and view maintenance cost. We design algorithms which are implemented and we report on their experimental evaluation.

5.
A data warehouse (DW) can be seen as a set of materialized views defined over remote base relations. When a query is posed, it is evaluated locally, using the materialized views, without accessing the original information sources. The DWs are dynamic entities that evolve continuously over time. As time passes, new queries need to be answered by them. Some of these queries can be answered using exclusively the materialized views. In general, though, new views need to be added to the DW. In this paper we investigate the problem of incrementally designing a DW when new queries need to be answered and possibly extra space is allocated for view materialization. Based on an AND/OR dag representation of multiple queries, we model the problem as a state space search problem. We design incremental algorithms for selecting a set of new views to additionally materialize in the DW that: (a) fits in the extra space, (b) allows a complete rewriting of the new queries over the materialized views, and (c) minimizes the combined new query evaluation and new view maintenance cost. Finally, we discuss methods for pruning the search space so that efficiency is improved.

6.
The view selection problem is to choose a set of views to materialize over a database schema, such that the cost of evaluating a set of workload queries is minimized and such that the views fit into a prespecified storage constraint. The two main applications of the view selection problem are materializing views in a database to speed up query processing, and selecting views to materialize in a data warehouse to answer decision support queries. In addition, view selection is a core problem for intelligent data placement over a wide-area network for data integration applications and data management for ubiquitous computing. We describe several fundamental results concerning the view selection problem. We consider the problem for views and workloads that consist of equality-selection, project and join queries, and show that the complexity of the problem depends crucially on the quality of the estimates that a query optimizer has on the size of the views it is considering to materialize. When a query optimizer has good estimates of the sizes of the views, we show a somewhat surprising result, namely, that an optimal choice of views may involve a number of views that is exponential in the size of the database schema. On the other hand, when an optimizer uses standard estimation heuristics, we show that the number of necessary views and the expression size of each view are polynomially bounded. Received: November 20, 2001 / Accepted: May 30, 2002 / Published online: September 25, 2002
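The problem statement in this abstract (minimize workload cost subject to a storage constraint) can be made concrete with a small sketch. The code below is not the paper's method: it is a brute-force search over all view subsets, usable only for toy instances, and the function name `select_views`, its parameters, and the cost tables are all hypothetical. It also assumes each query is answered from a single view or from the base relations, whichever is cheapest.

```python
from itertools import combinations


def select_views(view_sizes, query_costs, base_costs, budget):
    """Exhaustively pick the view subset that minimizes total workload cost.

    view_sizes:  {view: storage size}                      (hypothetical input)
    query_costs: {query: {view: cost using that view}}     (hypothetical input)
    base_costs:  {query: cost when answered from base relations}
    budget:      storage limit for all materialized views
    """
    views = sorted(view_sizes)
    # Baseline: materialize nothing and answer every query from base relations.
    best_set, best_cost = frozenset(), sum(base_costs.values())
    for r in range(1, len(views) + 1):
        for subset in combinations(views, r):
            if sum(view_sizes[v] for v in subset) > budget:
                continue  # violates the storage constraint
            cost = 0
            for query, base in base_costs.items():
                usable = [query_costs[query][v] for v in subset
                          if v in query_costs[query]]
                cost += min(usable + [base])  # cheapest available plan
            if cost < best_cost:
                best_set, best_cost = frozenset(subset), cost
    return best_set, best_cost
```

The search is exponential in the number of candidate views, which is why the papers in this list turn to heuristics, state space pruning, or genetic algorithms for realistic inputs.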

7.
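Constrained materialized views based on a genetic algorithm   Cited by: 2 (self-citations: 0, citations by others: 2)
Selecting materialized views under constraints is one of the most important problems in current data warehouse research, and it is an NP problem. By studying the relationship between views and queries, this paper proposes a two-stage algorithmic framework for selecting materialized views: a candidate view selection algorithm first narrows the range of views to choose from, and a genetic algorithm then solves the constrained materialized view selection problem.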
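As a rough illustration of the second stage only, the sketch below runs a plain genetic algorithm over bit strings, where bit i says whether candidate view i is materialized. It is not the paper's algorithm: the function `ga_select`, the additive benefit model, the fitness rule for infeasible selections, and every parameter value are assumptions made for this example.

```python
import random


def ga_select(sizes, benefits, budget, pop_size=30, generations=60,
              p_mut=0.05, seed=0):
    """Genetic search for a view subset under a storage budget (toy sketch)."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    n = len(sizes)

    def fitness(bits):
        used = sum(s for s, b in zip(sizes, bits) if b)
        if used > budget:
            return -1  # assumption: infeasible selections rank below all others
        return sum(g for g, b in zip(benefits, bits) if b)

    # Include the empty selection so a feasible individual always exists.
    pop = [[0] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best individual forward
        while len(nxt) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randint(1, n - 1) if n > 1 else 0  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - b if rng.random() < p_mut else b for b in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best, fitness(best)
```

Because of elitism and the empty starting individual, the returned selection always fits the budget, but a genetic algorithm gives no guarantee that it is optimal.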

8.
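Materialized view selection based on the QC model in Web data integration systems   Cited by: 2 (self-citations: 0, citations by others: 2)
In a Web data integration system, materialized views can effectively reduce network transmission cost and improve the system's query efficiency. How to choose queries to materialize, so that the chosen queries satisfy the space limit of the integration layer while obtaining the maximum materialization benefit, has become a pressing problem in integration systems. Traditional methods do not take into account the containment relationships among massive numbers of XML queries, so the materialized views they select may contain redundant information. To address these problems, the paper proposes (1) a QC (query containment) model for massive query sets in Web data integration systems, which captures the most common containment relationships between queries; and (2) a materialized view selection algorithm based on the QC model. The algorithm considers the main factors relevant to materialized view selection, including query submission frequency, space cost, query rewriting capability, and completeness of query results, and proposes a query-bitmap organization for materialized views, thereby obtaining a more reasonable materialized view selection scheme. Experimental results demonstrate the effectiveness of the method.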

9.
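Materialized views are an effective means of improving query efficiency in a data warehouse. While the data warehouse is running, its materialized views must be maintained to keep user query response times short. To address the slow convergence and long running time of the genetic algorithm used for dynamic selection of materialized views, a preprocessing algorithm is proposed to compute the genetic algorithm's initial population for dynamic materialized view selection. Theoretical analysis and experimental results show that the algorithm effectively improves the convergence speed of the optimization search during dynamic selection of materialized views.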

10.
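Materialized view selection strategy in data warehouses   Cited by: 2 (self-citations: 0, citations by others: 2)
To improve the response efficiency of decision support and OLAP queries, data warehouses commonly adopt materialized views. The selection strategy for materialized views is therefore one of the important problems in data warehouse research. Its goal is to select a set of materialized views that minimizes the sum of storage, maintenance, and query costs. A new materialized view selection algorithm, VSMF (views selection base on multi-factor), is proposed, which uses the MVPP (multi-view processing plan) as the search space for view selection. Under a storage space constraint, the algorithm achieves multi-query optimization and view maintenance optimization at the same time.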

11.
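Materialized view selection is an important topic in data warehouse research, and the selection strategy directly affects the query efficiency of the data warehouse. Based on the design of a supermarket data warehouse and an analysis of existing research results, the paper makes some improvements to the materialized view selection algorithm and gives an algorithm that dynamically adjusts the set of materialized views according to changes in the query workload.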

12.
OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method to rewrite a given OLAP query using various kinds of materialized views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the selection and aggregation granularities, which are derived from the lattice of dimension hierarchies. Conditions for usability of materialized views in rewriting a given query are specified by relationships between the components of their normal forms. We present a rewriting algorithm for OLAP queries that can effectively utilize materialized views having different selection granularities, selection regions, and aggregation granularities together. We also propose an algorithm to find a set of materialized views that results in a rewritten query which can be executed efficiently. We show the effectiveness and performance of the algorithm experimentally.
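One ingredient of such usability conditions, the comparison of aggregation granularities along dimension hierarchies, can be sketched as follows. This is a simplified illustration, not the paper's normal forms: it ignores selection granularities and selection regions, it assumes a distributive aggregate such as SUM that can be rolled up from finer groups, and the hierarchies and the function `can_answer` are hypothetical.

```python
# Hypothetical dimension hierarchies, ordered from finest to coarsest level.
HIERARCHIES = {
    "time": ["day", "month", "year", "all"],
    "store": ["store", "city", "country", "all"],
}


def can_answer(view_gran, query_gran, hierarchies=HIERARCHIES):
    """Return True if the view is at least as fine as the query in every dimension.

    view_gran / query_gran map each dimension name to one of its hierarchy levels.
    A finer view can be rolled up to a coarser query, but not the other way round.
    """
    for dim, levels in hierarchies.items():
        if levels.index(view_gran[dim]) > levels.index(query_gran[dim]):
            return False  # view is coarser than the query in this dimension
    return True
```

For example, a view grouped by day and city can be rolled up to answer a query grouped by month and country, while a view grouped by year cannot answer a query that needs monthly totals.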

13.
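A data warehouse usually has to perform computations over large volumes of data and answer user queries with condensed results, which makes materialized view technology especially important in data warehouses. However, existing methods that support automatic selection of materialized views are static, which runs counter to the dynamic nature of online analytical processing (OLAP) and decision support systems (DSS). This paper proposes a scalable dynamic materialized view method that decomposes the overall materialized view selection (MVS) problem into three stages, reducing the complexity of the problem and improving the effectiveness of the materialized views. Through dynamic adjustment, the materialized views adapt promptly to query demand. An analysis of algorithmic complexity demonstrates the scalability of the scheme, and simulation experiments on the dynamic adjustment algorithm verify that the scheme is highly adaptive.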

14.
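Group-by aggregate queries have become one of the core research problems in the data warehouse field, and materialized views are an effective means of improving their performance. Using the hierarchical relationships among dimension attributes, the paper extends query rewriting with materialized views in the general sense, discusses the restrictions on rewriting a query with a single view, and gives a rewriting method. On this basis, it proposes an optimized selection algorithm for rewriting queries with multiple materialized views, and experiments show that the algorithm further improves the efficiency of group-by aggregate queries.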

15.
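Data warehouses store large numbers of materialized views to speed up OLAP query response, and the selection of materialized views is an important problem in data warehouse design. The paper proposes an effective materialized view selection algorithm that selects views by a level-wise search of the data cube. Analysis and testing show that the algorithm achieves good effectiveness and efficiency.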

16.
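The materialized view selection problem is one of the important problems in data warehouse research. A data warehouse stores materialized views mainly for OLAP queries, and user query response time is the primary consideration. The paper poses the query-cost view selection problem and gives its cost model. It proposes methods and strategies for solving the query-cost view selection problem with a genetic algorithm. Experiments show that the algorithm achieves good results with high efficiency.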

17.
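NDSMMV: a new strategy for dynamic selection of materialized views in multidimensional data sets   Cited by: 2 (self-citations: 0, citations by others: 2)
The materialized view selection strategy is one of the important problems in data warehouse research. Through in-depth study, a new strategy for dynamic selection of materialized views in multidimensional data sets, NDSMMV, is proposed. It comprises the candidate view generation algorithm CVGA, the materialized view selection algorithm IGA, the materialized view adjustment algorithm MAMV, and the dynamic materialized view adjustment algorithm DMAMV. CVGA generates the candidate view set from the multidimensional data lattice and reduces the number of candidate views, lowering the view-space search cost and time complexity of the subsequent algorithms. IGA selects materialized views from the candidate set using a three-part evaluation criterion of view query, view maintenance, and storage space. MAMV further adjusts the materialized views according to how the benefits of already selected views change during the selection process, in order to improve query response performance. DMAMV periodically checks whether the distribution of query view types has changed to decide whether to adjust the materialized views dynamically, thereby avoiding "thrashing" of the materialized view set. Theoretical analysis and experimental results show that the strategy is effective and feasible.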

18.
We consider a workload of aggregate queries and investigate the problem of selecting materialized views that (1) provide equivalent rewritings for all the queries, and (2) are optimal, in that the cost of evaluating the query workload is minimized. We consider conjunctive views and rewritings, with or without aggregation; in each rewriting, only one view contributes to computing the aggregated query output. We look at query rewriting using existing views and at view selection. In the query-rewriting problem, we give sufficient and necessary conditions for a rewriting to exist. For view selection, we prove complexity results. Finally, we give algorithms for obtaining rewritings and selecting views.

19.
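Materialized view selection method based on multiple maintenance strategies   Cited by: 1 (self-citations: 0, citations by others: 1)
Materialized views are an important means of improving OLAP query efficiency in a data warehouse environment, so the selection of materialized views is one of the important decisions in data warehouse design. The goal of the materialized view selection method proposed in this paper is to choose suitable views to materialize so that the total cost of query processing and the maintenance cost of the materialized views are minimized. The paper proposes a materialized view benefit model and, on that basis and using multiple view maintenance strategies, proposes the following materialized view selection methods: IRMVS, based on incremental maintenance and recomputation; IMVS, based on the incremental strategy; RMVs, based on the recomputation strategy; and IMDVS, which selects materialized descendant views based on the incremental strategy. Theoretical analysis and experiments show that these algorithms are effective and feasible.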

20.
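Materialized views can effectively improve the query efficiency of spatial data warehouses, but because of the complexity of spatial operations, materialized view selection algorithms from traditional data warehouses cannot be applied well to spatial data warehouses. To select queries for materialization under a storage space constraint and to adjust the materialized view set dynamically, so as to accommodate the time-varying nature of user queries and ad hoc queries, the spatial materialized view selection algorithm SMVS is proposed. Experimental results show that the algorithm is effective and feasible: it not only improves query performance but also solves the problem of query response performance degrading as the distribution of user queries changes.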
