首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 593 毫秒
1.
Designing data warehouses   总被引:9,自引:0,他引:9  
A Data Warehouse (DW) is a database that collects and stores data from multiple remote and heterogeneous information sources. When a query is posed, it is evaluated locally, without accessing the original information sources. In this paper we deal with the issue of designing a DW, in the context of the relational model, by selecting a set of views to materialize in the DW. First, we briefly present a theoretical framework for the DW design problem, which concerns the selection of a set of views that (a) fit in the space allocated to the DW, (b) answer all the queries of interest, and (c) minimize the total query evaluation and view maintenance cost. We then formalize the DW design problem as a state space search problem by taking into account multiquery optimization over the maintenance queries (i.e., queries that compute changes to the materialized views) and the use of auxiliary views for reducing the view maintenance cost. Finally, incremental algorithms and heuristics for pruning the search space are presented.  相似文献   

2.
《Information Systems》2001,26(5):363-381
A data warehouse (DW) can be abstractly seen as a set of materialized views defined over a set of remote data sources. A DW is intended to satisfy a set of queries. The views materialized in a DW relate to each other in a complex manner, through common subexpressions, in order to guarantee high query performance and low view maintenance cost. DWs are time varying. As time passes new materialized views are added in order to satisfy new queries, or for performance reasons, while old queries are dropped. The evolution of a DW can result in a redundant set of materialized views. In this paper, we address the problem of detecting redundant materialized views in a given DW view selection, that is, materialized views that can be removed from DW without negatively affecting the query evaluation or the view maintenance process. Using an AND/OR dag representation for multiple queries and views, we first formalize the process of propagating source relation changes to the materialized views by exploiting common subexpressions between views and by using other materialized views that are not affected by these changes. Then, we provide an algorithm for detecting materialized views that are not needed in the process of propagating source relation changes to the DW. We also show how trivially redundant views can be identified in this process. Finally, we use these results to provide a procedure for detecting materialized views that are redundant in a DW. Our approach considers a broad class of views that includes grouping/aggregation views and is not dependent on a specific cost model.  相似文献   

3.
View selection for designing the global data warehouse   总被引:1,自引:0,他引:1  
A global data warehouse (DW) integrates data from multiple distributed heterogeneous databases and other information sources. A global DW can be abstractly seen as a set of materialized views. The selection of views for materialization in a DW is an important decision in the design of a DW. Current commercial products do not provide tools for automatic DW design. We provide a general method that, given a set of select-project-join queries to be satisfied by the DW, generates sets of materialized views that satisfy all the input queries. This process is complex since ‘common subexpressions' between the queries need to be detected and exploited. Our method is then applied to solve the problem of selecting such a materialized view set that fits in the space allocated to the DW for materialization and minimizes the combined overall query evaluation and view maintenance cost. We design algorithms which are implemented and we report on their experimental evaluation.  相似文献   

4.
Web数据集成系统基于QC模型的物化视图选择   总被引:2,自引:0,他引:2  
在Web数据集成系统中,物化视图能够有效地减少网络传输代价,提高系统的查询效率.如何选择查询进行物化,使得选中的查询满足集成层的空间限制,同时获取最大物化收益,成为集成系统中一个迫切需要解决的问题.传统方法没有考虑到海量XML查询之间的包含关系,其选择的物化视图中可能包含冗余的信息.针对上述问题,提出了①Web数据集成系统中海量查询集合的QC(query containment)模型,该模型能够捕捉查询之间最常见的包含关系;②基于QC模型的物化视图选择算法,算法考虑了物化视图选择相关的主要因素,包括查询提交的频率、空间代价、查询重写能力和查询结果的完备性,提出了查询位图的物化视图组织方式,从而获取更加合理的物化视图选择方案.实验结果证明了该方法的有效性.  相似文献   

5.
数据仓库中物化视图选择策略   总被引:2,自引:0,他引:2  
为了提高决策支持和OLAP查询的响应效率,数据仓库多采用物化视图的思想.因此,物化视图的选择策略是数据仓库研究的重要问题之一.其目标是选出一组存储、维护代价与查询代价的总和为最小的物化视图.提出一个以MVPP(multi-view processing plan)为视图选择的搜索空间的物化视图选择新算法--VSMF(views selection base on multi-factor)算法.该算法在存储空间约束下同时实现多查询最优化和视图维护最优化.  相似文献   

6.
OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method to rewrite a given OLAP query using various kinds of materialized views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the selection and aggregation granularities, which are derived from the lattice of dimension hierarchies. Conditions for usability of materialized views in rewriting a given query are specified by relationships between the components of their normal forms. We present a rewriting algorithm for OLAP queries that can effectively utilize materialized views having different selection granularities, selection regions, and aggregation granularities together. We also propose an algorithm to find a set of materialized views that results in a rewritten query which can be executed efficiently. We show the effectiveness and performance of the algorithm experimentally.  相似文献   

7.
一种利用实化视图快速响应查询的技术   总被引:1,自引:0,他引:1       下载免费PDF全文
实化视图可以显著改进查询处理的性能,针对拥有大量实化视图的实际系统,提出了层次索引和视图合并两种方法来有效减少可能被利用的实化视图的搜索空间,还提出了实用的启发式算法以找出较优重写查询。实验表明,所给算法可用来快速地响应查询。  相似文献   

8.
物化视图能够有效地提高空间数据仓库的查询效率,但由于空间操作的复杂性,传统数据仓库中物化视图的选择算法不能很好地应用于空间数据仓库。为了在存储空间约束下选择查询进行物化,并动态调整物化视图集,以适应用户查询的时变性和即席查询,提出了空间物化视图选择算法SMVS。实验结果表明该算法是有效可行的,不仅能够提高查询性能,而且解决了查询响应性能随用户查询分布变化而下降的问题。  相似文献   

9.
一种实化视图的合并算法   总被引:1,自引:0,他引:1  
陈长清  程恳 《计算机应用》2005,25(4):814-816
对于拥有大量实化视图的实际数据库应用系统,提出了视图合并的方法以减少整个视图 的数量,缩减实化视图的搜索空间;还提出了归并树和基于归并树的快速有效的合并算法。实验表 明,实化视图的合并是快速寻找可能响应查询的实化视图的一种有效途径,可以显著改进查询处理的 性能。  相似文献   

10.
分组聚集查询已成为数据仓库领域研究的核心问题之一,实视图是提高分组聚集查询性能的有效手段。利用维属性间的层次关系,对一般意义上的实视图重写查询进行了扩展,讨论了单一视图重写查询的限制条件,并给出重写方法,在此基础上,提出了一种利用多个实视图重写查询的优化选择算法,并通过实验表明,该算法进一步提高了分组聚集查询效率。  相似文献   

11.
物化视图选择的预处理算法   总被引:4,自引:1,他引:4  
现有的静态物化视图选择算法的视图搜索代价较大,而导致算法的时间复杂度偏高,不能用于对物化视图进行在线动态调整.提出了一种物化视图选择的预处理算法——PMVS,其中包括用户查询集动态调整算法QSDM、候选视图格构造算法CVLC和候选视图筛选算法CVF,该算法可用做预处理过程对视图数量进行在线压缩,从而降低了静态算法的视图空间搜索代价和时间复杂度.理论分析和实验结果表明该算法是有效可行的.  相似文献   

12.
View materialization is an effective method to increase query efficiency in a data warehouse and improve OLAP query performance. However, one encounters the problem of space insufficiency if all possible views are materialized in advance. Reducing query time by means of selecting a proper set of materialized views with a lower cost is crucial for efficient data warehousing. In addition, the costs of data warehouse creation, query, and maintenance have to be taken into account while views are materialized. In this paper, we propose efficient algorithms to select a proper set of materialized views, constrained by storage and cost considerations, to help speed up the entire data warehousing process. We derive a cost model for data warehouse query and maintenance as well as efficient view selection algorithms that effectively exploit the gain and loss metrics. The main contribution of our paper is to speed up the selection process of materialized views. Concurrently, this will greatly reduce the overall cost of data warehouse query and maintenance.  相似文献   

13.
数据仓库中物化视图选择算法的代价与搜索空间的尺寸紧密相关。提出了一种基于输入查询的公共子表达式的候选视图搜索空间构造方法IMVPP,利用算法1计算出的公共子表达式,能被其他查询共享,并可对输入查询进行重写,有利于缩减视图搜索空间,提高查询效率。理论分析与实验结果表明,此方法是有效、可行的。  相似文献   

14.
静态物化视图的动态Cache优化算法   总被引:2,自引:0,他引:2  
针对静态物化视图集动态适应能力的不足,提出一种动态cache优化算法DCO(dynamic cacheoptimization).它在保持静态算法获取最优物化集能力的基础上,将cache机制直观、快速的动态特性结合进来,以提高数据仓库的动态自适应性能.在cache机制具体实现中提出了一种新颖的空间申请方法,可以充分利用系统剩余空间提高查询响应性能.实验结果在表明算法有效、可行的同时,也显示出该算法可以在一定程度上克服静态物化集存在的空间-性能饱和效应(space-performance saturation effect,简称SPSE),使通过增加物化空间进一步提高数据仓库对查询的响应速度成为可能.  相似文献   

15.
Emerging applications such as personalized portals, enterprise search, and web integration systems often require keyword search over semi-structured views. However, traditional information retrieval techniques are likely to be expensive in this context because they rely on the assumption that the set of documents being searched is materialized. In this paper, we present a system architecture and algorithm that can efficiently evaluate keyword search queries over virtual (unmaterialized) XML views. An interesting aspect of our approach is that it exploits indices present on the base data and thereby avoids materializing large parts of the view that are not relevant to the query results. Another feature of the algorithm is that by solely using indices, we can still score the results of queries over the virtual view, and the resulting scores are the same as if the view was materialized. Our performance evaluation using the INEX data set in the Quark (Bhaskar et al. in Quark: an efficient XQuery full-text implementation. In: SIGMOD, 2006) open-source XML database system indicates that the proposed approach is scalable and efficient.  相似文献   

16.
基于多维护策略的物化视图选择方法   总被引:1,自引:0,他引:1  
物化视图是数据仓库环境中提高OLAP查询效率的重要手段,因此,物化视图的选择是数据仓库设计中重要的决策之一。本文提出的物化视图选择方法目标是选择合适的视图进行物化,使得查询处理的总代价和物化视图的维护代价最低,提出了物化视图收益模型,并在此基础上基于视图的多维护策略提出了物化视图选择的方法:基于增量和重计算的物化视图选择算法IRMVS、基于增量策略的物化视图选择算法IMVS和基于重计算策略的物化视图选择算法RMVs和基于增量策略的物化后代视图选择算法IMDVS,理论分析和实验表明这些算法是有效可行的。  相似文献   

17.
Answering queries using views: A survey   总被引:25,自引:0,他引:25  
The problem of answering queries using views is to find efficient methods of answering a query using a set of previously defined materialized views over the database, rather than accessing the database relations. The problem has recently received significant attention because of its relevance to a wide variety of data management problems. In query optimization, finding a rewriting of a query using a set of materialized views can yield a more efficient query execution plan. To support the separation of the logical and physical views of data, a storage schema can be described using views over the logical schema. As a result, finding a query execution plan that accesses the storage amounts to solving the problem of answering queries using views. Finally, the problem arises in data integration systems, where data sources can be described as precomputed views over a mediated schema. This article surveys the state of the art on the problem of answering queries using views, and synthesizes the disparate works into a coherent framework. We describe the different applications of the problem, the algorithms proposed to solve it and the relevant theoretical results. Received: 1 August 1999 / Accepted: 23 March 2001 Published online: 6 September 2001  相似文献   

18.
李凌燕  杭晓骏 《微机发展》2007,17(8):130-132
联机分析处理(OLAP)是伴随着数据仓库出现的一种数据分析处理技术,其特点是使分析人员能够更充分地利用数据仓库中的数据资源,从多种角度、多个层次,快速地构建易为用户理解的并全面反映企业行为特征的数据快照,从而可使用户更加深入地了解企业的发展状况和趋势。ROLAP是OLAP中使用最广泛的一种类型。文中对影响ROLAP查询效率的关键技术进行了讨论,提出了一个改进的实视图动态选择算法。该算法从存储空间、查询频率、更新代价三个方面综合评价每个实视图,有效地保证了ROLAP查询的响应时间。  相似文献   

19.
Selection of views to materialize in a data warehouse   总被引:4,自引:0,他引:4  
A data warehouse stores materialized views of data from one or more sources, with the purpose of efficiently implementing decision-support or OLAP queries. One of the most important decisions in designing a data warehouse is the selection of materialized views to be maintained at the warehouse. The goal is to select an appropriate set of views that minimizes total query response time and the cost of maintaining the selected views, given a limited amount of resource, e.g., materialization time, storage space, etc. In This work, we have developed a theoretical framework for the general problem of selection of views in a data warehouse. We present polynomial-time heuristics for a selection of views to optimize total query response time under a disk-space constraint, for some important special cases of the general data warehouse scenario, viz.: 1) an AND view graph, where each query/view has a unique evaluation, e.g., when a multiple-query optimizer can be used to general a global evaluation plan for the queries, and 2) an OR view graph, in which any view can be computed from any one of its related views, e.g., data cubes. We present proofs showing that the algorithms are guaranteed to provide a solution that is fairly close to (within a constant factor ratio of) the optimal solution. We extend our heuristic to the general AND-OR view graphs. Finally, we address in detail the view-selection problem under the maintenance cost constraint and present provably competitive heuristics.  相似文献   

20.
常用OLAP查询优化方法性能分析   总被引:1,自引:0,他引:1  
张银玲  武彤 《微机发展》2014,(1):39-42,46
OLAP(OnlineAnalyticalProcessing)查询常常涉及到不同的维表和事实表,要得到查询结果通常需要进行多张表的连接操作。连接操作是一种非常耗时的操作,因此,如何提高OLAP查询效率成为数据仓库应用中的关键问题。文中对存储过程、索引技术、物化视图等几种常用的OLAP查询优化方法进行性能分析,针对特定应用通过反复实验比较得出物化视图的优越性。而就物化视图而言,其本身有优越性的同时也存在一些缺陷。因此,针对物化视图更新问题提出了几种更新方案。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号