
1.

Using object deputy model to prepare data for data warehousing

Zhiyong Peng Qing Li Feng L. Xuhui Li Junqiang Liu 《Knowledge and Data Engineering, IEEE Transactions on》2005,17(9):1274-1288

Providing integrated access to multiple, distributed, heterogeneous databases and other information sources has become one of the leading issues in database research and the industry. One of the most effective approaches is to extract and integrate information of interest from each source in advance and store it in a centralized repository (known as a data warehouse). When a query is posed, it is evaluated directly at the warehouse without accessing the original information sources. One of the techniques that this approach uses to improve the efficiency of query processing is materialized views. Essentially, materialized views are used for data warehouses, and various methods for relational databases have been developed. In this paper, we first discuss an object deputy approach to realize materialized object views for data warehouses which can also incorporate object-oriented databases. A framework has been developed using Smalltalk to prepare data for data warehousing, in which an object deputy model and database connecting tools have been implemented. The object deputy model can provide an easy-to-use way to resolve inconsistency and conflicts while preparing data for data warehousing, as evidenced by our empirical study.

2.

Synchronous incremental update of materialized views for PostgreSQL

Nguyen Tran Quoc Vinh 《Programming and Computer Software》2016,42(5):307-315

Materialized views are logically redundant stored query results in SQL-oriented databases. This technology can significantly improve the performance of database systems. Although the idea of materialized views came up in the 1980s, only three database management systems (DB2, Oracle, and SQL Server) have so far implemented materialized views reasonably completely. The barrier lies in building a module that automatically and incrementally updates the materialized views in response to data changes in the base tables. This paper presents an algorithm for incrementally updating materialized views with inner joins, focusing on views with aggregate functions, and describes a program that automatically generates PL/pgSQL trigger code to perform synchronous incremental updates of materialized views in PostgreSQL.
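The incremental-update idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm and not generated PL/pgSQL; the tables `orders` and `customers`, the class `RegionSalesView`, and the helper `recompute` are hypothetical names chosen for illustration, and the sketch assumes only `orders` changes while `customers` stays fixed.

```python
class RegionSalesView:
    """Keeps the result of the (hypothetical) view
        SELECT c.region, COUNT(*), SUM(o.amount)
        FROM orders o JOIN customers c ON o.customer_id = c.customer_id
        GROUP BY c.region
    up to date by applying per-row deltas instead of recomputing."""

    def __init__(self, customers):
        # customer_id -> region; assumed not to change in this sketch
        self.customers = dict(customers)
        # region -> (row count, sum of amounts)
        self.rows = {}

    def _apply(self, customer_id, amount, sign):
        region = self.customers.get(customer_id)
        if region is None:
            return  # no join partner: the inner join contributes no row
        count, total = self.rows.get(region, (0, 0))
        count += sign
        total += sign * amount
        if count == 0:
            self.rows.pop(region, None)  # group became empty
        else:
            self.rows[region] = (count, total)

    def on_insert_order(self, customer_id, amount):
        """Plays the role of an AFTER INSERT trigger on orders."""
        self._apply(customer_id, amount, +1)

    def on_delete_order(self, customer_id, amount):
        """Plays the role of an AFTER DELETE trigger on orders."""
        self._apply(customer_id, amount, -1)


def recompute(customers, orders):
    """Full recomputation, used only to check the incremental result."""
    rows = {}
    for customer_id, amount in orders:
        region = customers.get(customer_id)
        if region is None:
            continue
        count, total = rows.get(region, (0, 0))
        rows[region] = (count + 1, total + amount)
    return rows
```

COUNT and SUM can be maintained from the changed row alone, which is why each update touches a single group; aggregates such as MIN or MAX would need extra bookkeeping on deletion.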

3.

Single-view consistency in Web warehouses (Web仓储中的单视图一致性)

张岩杨冬青唐世渭《计算机研究与发展》2004,41(1):194-200

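A Web warehouse system manages and maintains Web data using materialized views, which speed up user queries and analysis and make the approach particularly suitable for online analytical processing (OLAP) and decision support. Data in the Web environment is updated very frequently, so materialized views must be refreshed continually to keep the system fresh. While a Web view is being refreshed, consistency between the materialized view and the underlying base data (called single-view consistency, SVC) must be preserved; otherwise the data in the system becomes erroneous and users can no longer rely on it. Focusing on single-view consistency and taking the characteristics of the Web environment into account, the paper presents algorithms for preserving it; these algorithms adapt well to the Web environment and scale well.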

4.

Handling overlapping materialized views in dynamic materialized view selection algorithms (实视图动态选择算法中重叠实视图的处理)

雷旭袁捷《计算机工程》2006,32(6):79-81

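When materialized views are used to improve the efficiency of an OLAP system, a materialized view often does not coincide with a complete lattice node; that is, the views are multidimensional data slices (MRFs). As a result, the system accumulates many materialized views with overlapping data, which not only take up excessive storage space but also complicate answering users' multidimensional queries from the existing materialized views. Earlier dynamic materialized view selection algorithms did not handle this situation. Drawing on the concept of the lattice model, the paper proposes an algorithm for merging materialized views with overlapping data, covering how to determine whether materialized views overlap and how to merge those that do.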

5.

Recommending XML physical designs for XML databases

Iman Elghandour Ashraf Aboulnaga Daniel C. Zilio Calisto Zuzarte 《The VLDB Journal The International Journal on Very Large Data Bases》2013,22(4):447-470

Database systems employ physical structures such as indexes and materialized views to improve query performance, potentially by orders of magnitude. It is therefore important for a database administrator to choose the appropriate configuration of these physical structures for a given database. XML database systems are increasingly being used to manage semi-structured data, and XML support has been added to commercial database systems. In this paper, we address the problem of automatic physical design for XML databases, which is the process of automatically selecting the best set of physical structures for a database and a query workload. We focus on recommending two types of physical structures: XML indexes and relational materialized views of XML data. We present a design advisor for recommending XML indexes, one for recommending materialized views, and an integrated design advisor that recommends both indexes and materialized views. A key characteristic of our advisors is that they are tightly coupled with the query optimizer of the database system, and they rely on the optimizer for enumerating and evaluating physical designs. We have implemented our advisors in a prototype version of IBM DB2 V9, and we experimentally demonstrate the effectiveness of their recommendations using this implementation.

6.

View selection for designing the global data warehouse (total citations: 1; self-citations: 0; citations by others: 1)

Dimitri Spyros Timos 《Data & Knowledge Engineering》2001,39(3):219-240

A global data warehouse (DW) integrates data from multiple distributed heterogeneous databases and other information sources. A global DW can be abstractly seen as a set of materialized views. The selection of views for materialization in a DW is an important decision in the design of a DW. Current commercial products do not provide tools for automatic DW design. We provide a general method that, given a set of select-project-join queries to be satisfied by the DW, generates sets of materialized views that satisfy all the input queries. This process is complex since ‘common subexpressions’ between the queries need to be detected and exploited. Our method is then applied to solve the problem of selecting a materialized view set that fits in the space allocated to the DW for materialization and minimizes the combined overall query evaluation and view maintenance cost. We design and implement algorithms for this problem and report on their experimental evaluation.

7.

Finding an efficient rewriting of OLAP queries using materialized views in data warehouses

Chang-Sup Myoung Ho Yoon-Joon 《Decision Support Systems》2002,32(4)

OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method to rewrite a given OLAP query using various kinds of materialized views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the selection and aggregation granularities, which are derived from the lattice of dimension hierarchies. Conditions for usability of materialized views in rewriting a given query are specified by relationships between the components of their normal forms. We present a rewriting algorithm for OLAP queries that can effectively utilize materialized views having different selection granularities, selection regions, and aggregation granularities together. We also propose an algorithm to find a set of materialized views that results in a rewritten query which can be executed efficiently. We show the effectiveness and performance of the algorithm experimentally.

8.

Selection and adjustment strategies for materialized views in a supermarket data warehouse (超市数据仓库中物化视图的选择与调整策略)

姜合杨春花耿玉水《计算机应用与软件》2007,24(3):91-93

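Materialized view selection is an important topic in data warehouse research, and the selection strategy directly affects the query efficiency of the data warehouse. Based on the design of a supermarket data warehouse and an analysis of existing research, the paper makes some improvements to the materialized view selection algorithm and presents an algorithm that dynamically adjusts the set of materialized views as the query workload changes.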

9.

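A base-repository generation algorithm for incremental maintenance of multiple materialized views (基于多实化视图增量维护的基库生成算法)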

杜威潘久辉邹先霞《计算机工程与应用》2006,42(8):175-177

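Maintaining a data warehouse is a very important issue in data warehouse applications, and many maintenance algorithms have been proposed in recent years. Most existing algorithms address the maintenance of a single materialized view, of simple SPJ views only, or of aggregate functions only, whereas a real data warehouse mostly consists of multiple materialized views that contain aggregate functions. The maintenance of multiple materialized views containing aggregate functions therefore has to be considered as a whole. Against this background, the paper proposes a base-repository generation algorithm for the incremental maintenance of multiple materialized views; the maintenance algorithm for multiple materialized views containing aggregate functions is presented in the companion paper 《基于基库的多实化视图增量维护算法》 (a base-repository-based incremental maintenance algorithm for multiple materialized views).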

10.

Design and implementation of a view decomposition system for data warehouse self-maintenance (数据仓库自维护下视图分解系统的设计与实现)

毛莉潘久辉《计算机工程与设计》2007,28(15):3800-3802

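Data warehouse self-maintenance is essentially achieved by maintaining materialized views. However, existing self-maintenance strategies for materialized views cannot effectively reduce redundant data at the warehouse integrator and at the data source monitors, which degrades the overall response time of the data warehouse environment. A view decomposition system based on the data warehouse self-maintenance approach improves the existing view decomposition scheme: it decomposes globally defined materialized views into sets of locally defined single-source views to reduce unnecessary data stored in the warehouse, implements decomposition and rewriting for existing materialized view self-maintenance strategies, and improves the efficiency of data warehouse self-maintenance.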

11.

A scalable dynamic materialized view selection method for data warehouse environments (数据仓库环境中可扩展的动态物化视图选择方法)

衣振萍潘景昌郭强姜斌《计算机与现代化》2007,(8):74-77

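A data warehouse typically computes over large volumes of data and answers user queries with condensed results, which makes materialized views especially important. However, existing methods that support automatic materialized view selection are static, which is at odds with the dynamic nature of online analytical processing (OLAP) and decision support systems (DSS). This paper proposes a scalable dynamic materialized view method that divides the materialized view selection (MVS) problem into three phases, reducing the complexity of the problem and improving the effectiveness of the materialized views. Through dynamic adjustment, the materialized views adapt promptly to query demand. A complexity analysis of the algorithm demonstrates the scalability of the scheme, and simulation experiments with the dynamic adjustment algorithm confirm that it adapts well.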

12.

Detecting redundant materialized views in data warehouse evolution

《Information Systems》2001,26(5):363-381

A data warehouse (DW) can be abstractly seen as a set of materialized views defined over a set of remote data sources. A DW is intended to satisfy a set of queries. The views materialized in a DW relate to each other in a complex manner, through common subexpressions, in order to guarantee high query performance and low view maintenance cost. DWs are time varying. As time passes new materialized views are added in order to satisfy new queries, or for performance reasons, while old queries are dropped. The evolution of a DW can result in a redundant set of materialized views. In this paper, we address the problem of detecting redundant materialized views in a given DW view selection, that is, materialized views that can be removed from DW without negatively affecting the query evaluation or the view maintenance process. Using an AND/OR dag representation for multiple queries and views, we first formalize the process of propagating source relation changes to the materialized views by exploiting common subexpressions between views and by using other materialized views that are not affected by these changes. Then, we provide an algorithm for detecting materialized views that are not needed in the process of propagating source relation changes to the DW. We also show how trivially redundant views can be identified in this process. Finally, we use these results to provide a procedure for detecting materialized views that are redundant in a DW. Our approach considers a broad class of views that includes grouping/aggregation views and is not dependent on a specific cost model.

13.

Queries and materialized views on probabilistic databases

Nilesh Dalvi Christopher Re Dan Suciu 《Journal of Computer and System Sciences》2011,77(3):473-490

We review in this paper some recent yet fundamental results on evaluating queries over probabilistic databases. While one can see this problem as a special instance of general purpose probabilistic inference, we describe in this paper two key database specific techniques that significantly reduce the complexity of query evaluation on probabilistic databases. The first is the separation of the query and the data: we show here that by doing so, one can identify queries whose data complexity is #P-hard, and queries whose data complexity is in PTIME. The second is the aggressive use of previously computed query results (materialized views): in particular, by rewriting a query in terms of views, one can reduce its complexity from #P-complete to PTIME. We describe a notion of a partial representation for views, and show that, once computed and stored, this partial representation can be used to answer subsequent queries on the probabilistic databases.

14.

Functional and embedded dependency inference: a data mining point of view

《Information Systems》2001,26(7):477-506

The issue of discovering functional dependencies from populated databases has received a great deal of attention because it is a key concern in database analysis. Such a capability is strongly required in database administration and design while being of great interest in other application fields such as query folding. Investigated for many years, the issue has recently been addressed in a novel and more efficient way by applying principles of data mining algorithms. The two algorithms fitting in such a trend are TANE and Dep-Miner. They strongly improve on previous proposals. In this paper, we propose a new approach adopting a data mining point of view. We define a novel characterization of minimal functional dependencies. This formal framework is sound and simpler than related work. We introduce the new concept of free set for capturing sources of functional dependencies. By using the concepts of closure and quasi-closure of attribute sets, targets of such dependencies are characterized. Our approach is implemented in the algorithm FUN, which is particularly efficient since it is comparable to or improves on the two best operational solutions (to our knowledge): TANE and Dep-Miner. It makes use of various optimization techniques and it can work on very large databases. Comparative experiments on real-life and synthetic data with varying degrees of correlation are performed in order to assess the performance of FUN against TANE and Dep-Miner. Moreover, our approach also exhibits (without significant additional execution time) embedded functional dependencies, i.e. dependencies captured in any subset of the attribute set originally considered. Embedded dependencies capture knowledge that is especially relevant in all fields where materialized data sets are managed (e.g. materialized views widely used in data warehouses).
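To make the notion of a minimal functional dependency concrete, the following Python sketch checks dependencies by brute force on a small in-memory relation. It is not FUN, TANE, or Dep-Miner and uses none of their free-set, closure, or partition techniques; the function names `holds` and `minimal_fds` and the row format (a list of dicts) are assumptions made for illustration, and the enumeration is exponential in the number of attributes.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows,
    i.e. no two rows agree on all lhs attributes but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


def minimal_fds(rows, attributes):
    """Enumerate minimal non-trivial dependencies X -> A.

    Left-hand sides are tried in order of increasing size, and any
    candidate that contains an already found left-hand side for the
    same target is skipped, so only minimal ones are reported.
    """
    found = []
    for rhs in attributes:
        others = [a for a in attributes if a != rhs]
        minimal_lhs = []
        # size 0 covers the case where the rhs column is constant
        for size in range(len(others) + 1):
            for lhs in combinations(others, size):
                if any(set(m) <= set(lhs) for m in minimal_lhs):
                    continue  # a subset already determines rhs
                if holds(rows, lhs, rhs):
                    minimal_lhs.append(lhs)
                    found.append((lhs, rhs))
    return found
```

On a relation where every employee has one department and every department one manager, this reports dependencies such as `("emp",) -> "dept"` and `("dept",) -> "mgr"` but not the non-minimal `("emp", "dept") -> "mgr"`.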

15.

Efficient approaches for materialized views selection in a data warehouse

Ming-Chuan Hung Man-Lin Huang Nien-Lin Hsueh 《Information Sciences》2007,177(6):1333-1348

View materialization is an effective method to increase query efficiency in a data warehouse and improve OLAP query performance. However, one encounters the problem of space insufficiency if all possible views are materialized in advance. Reducing query time by means of selecting a proper set of materialized views with a lower cost is crucial for efficient data warehousing. In addition, the costs of data warehouse creation, query, and maintenance have to be taken into account while views are materialized. In this paper, we propose efficient algorithms to select a proper set of materialized views, constrained by storage and cost considerations, to help speed up the entire data warehousing process. We derive a cost model for data warehouse query and maintenance as well as efficient view selection algorithms that effectively exploit the gain and loss metrics. The main contribution of our paper is to speed up the selection process of materialized views. Concurrently, this will greatly reduce the overall cost of data warehouse query and maintenance.
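The trade-off between storage and query cost described here can be sketched with a generic greedy heuristic in Python. This is not the paper's algorithm or its gain and loss metrics: the function `greedy_select`, the cost model (answering a query from a view costs the view's size, otherwise a fixed `base_cost`), and the input format are all assumptions made for illustration, and maintenance cost is ignored.

```python
def greedy_select(views, queries, base_cost, budget):
    """Pick views by benefit per unit of space until the budget is used.

    views:     name -> (size, set of query names the view can answer);
               sizes are assumed to be positive
    queries:   query name -> frequency
    base_cost: cost of answering any query from the base data
    budget:    total storage available for materialized views
    Returns (selected view names in pick order, final cost per query).
    """
    selected = []
    cost = {q: base_cost for q in queries}  # cheapest known cost per query
    remaining = budget
    while True:
        best, best_ratio = None, 0.0
        for name, (size, answers) in views.items():
            if name in selected or size > remaining:
                continue
            # weighted saving over the current cheapest way to answer
            benefit = sum(queries[q] * max(0, cost[q] - size) for q in answers)
            ratio = benefit / size
            if ratio > best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            return selected, cost  # nothing fits or nothing helps
        size, answers = views[best]
        selected.append(best)
        remaining -= size
        for q in answers:
            cost[q] = min(cost[q], size)
```

Because the benefit of a view is recomputed against the views already chosen, a large view that only duplicates savings already obtained from smaller ones is ranked low in later rounds.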

16.

Coupling indexes and materialized views for data warehouse performance optimization (数据仓库性能优化之索引和物化视图耦合方法)

马莹莹戴牡红《计算机应用研究》2013,30(3):835-837

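To further improve data warehouse performance, the paper analyzes the characteristics of performance optimization techniques in data warehouses and proposes an optimization technique that couples indexes and materialized views. Candidate indexes and materialized views are selected automatically through data mining to reduce the range scanned by queries; a space-efficient storage method for building indexes on materialized views is then studied to increase query speed; finally, a cost model is used to analyze the coupling, verifying that the coupled method can greatly improve on the performance of using indexes or materialized views alone.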

17.

Incremental view maintenance based on materialized auxiliary views in data warehouses (数据仓库中基于实体化辅助视图的视图增量维护)

胡孔法董逸生赵庆建《小型微型计算机系统》2003,24(2):251-254

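To speed up query processing over large volumes of data, data warehouses usually store data as materialized views. When the base data changes, these materialized views must be updated as well, so view self-maintenance and consistency maintenance become important problems for data warehouses. This paper proposes creating auxiliary views from the intermediate results of view computation, materializing them in the data warehouse, and using an efficient incremental maintenance algorithm to compute the exact changes to the materialized views, thereby achieving self-maintenance of data warehouse views.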

18.

Data mining-based materialized view and index selection in data warehouses

Kamel Aouiche Jérôme Darmont 《Journal of Intelligent Information Systems》2009,33(1):65-93

Materialized views and indexes are physical structures for accelerating data access that are commonly used in data warehouses. However, these data structures generate some maintenance overhead. They also share the same storage space. Most existing studies about materialized view and index selection consider these structures separately. In this paper, we adopt the opposite stance and couple materialized view and index selection to take view–index interactions into account and achieve efficient storage space sharing. Candidate materialized views and indexes are selected through a data mining process. We also exploit cost models that evaluate the respective benefit of indexing and view materialization, and help select a relevant configuration of indexes and materialized views among the candidates. Experimental results show that our strategy performs better than an independent selection of materialized views and indexes.

19.

DBToaster: higher-order delta processing for dynamic, frequently fresh views

Christoph Koch Yanif Ahmad Oliver Kennedy Milos Nikolic Andres Nötzli Daniel Lupei Amir Shaikhha 《The VLDB Journal The International Journal on Very Large Data Bases》2014,23(2):253-278

Applications ranging from algorithmic trading to scientific data analysis require real-time analytics based on views over databases receiving thousands of updates each second. Such views have to be kept fresh at millisecond latencies. At the same time, these views have to support classical SQL, rather than window semantics, to enable applications that combine current with aged or historical data. In this article, we present the DBToaster system, which keeps materialized views of standard SQL queries continuously fresh as data changes very rapidly. This is achieved by a combination of aggressive compilation techniques and DBToaster’s original recursive finite differencing technique which materializes a query and a set of its higher-order deltas as views. These views support each other’s incremental maintenance, leading to a reduced overall view maintenance cost. DBToaster supports tens of thousands of complete view refreshes per second for a wide range of queries.
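The idea of materializing a query together with its deltas can be illustrated by a hand-written Python sketch for one query. This is not DBToaster's compiler or its generated code; the relations `R(k, a)` and `S(k, b)`, the query `SUM(r.a * s.b)`, and the class `JoinSumView` are hypothetical, and only insertions are handled.

```python
from collections import defaultdict


class JoinSumView:
    """Maintains Q = SUM(r.a * s.b) over R(k, a) JOIN S(k, b) ON R.k = S.k.

    The change to Q caused by inserting (k, a) into R is a * SUM(b) over
    the S rows with key k. That per-key sum is itself kept as a small
    view (sum_b), and symmetrically sum_a for inserts into S. The changes
    to sum_a and sum_b are plain constants, so the recursion stops there.
    """

    def __init__(self):
        self.q = 0                     # the top-level view
        self.sum_a = defaultdict(int)  # k -> SUM(a) over R
        self.sum_b = defaultdict(int)  # k -> SUM(b) over S

    def insert_r(self, k, a):
        self.q += a * self.sum_b[k]  # first-order delta, no join executed
        self.sum_a[k] += a           # keep the auxiliary view fresh

    def insert_s(self, k, b):
        self.q += b * self.sum_a[k]
        self.sum_b[k] += b
```

Each insert costs a constant number of lookups and additions regardless of table size, because the auxiliary views hold exactly what the first-order delta needs.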

20.

A materialized view selection strategy for data warehouses (数据仓库中物化视图的选取策略)

王云峰张祖平《计算技术与自动化》2004,23(3):43-46

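A data warehouse stores a large number of materialized views to speed up OLAP query response, and selecting which views to materialize is an important problem in data warehouse design. The paper proposes an effective materialized view selection algorithm that selects views by a level-by-level search over the data cube. Analysis and testing show that the algorithm achieves good results and efficiency.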