Similar Literature
 20 similar documents found
1.
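In traditional materialized view maintenance, the data source sends incremental data to the data warehouse as an XML document; the warehouse parses the data out of this document and uses JDBC to update the materialized view. This paper proposes that the data source instead encapsulate the incremental data as serialized objects stored in a file and send that file to the warehouse; the warehouse then reads the objects from the file and uses Hibernate to apply them directly to the materialized view. A performance comparison of the two schemes shows that the latter is feasible and more efficient.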

2.
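Research on Materialized View Maintenance Algorithms for Distributed Data Sources   Total citations: 1, self-citations: 0, citations by others: 1
As the integrated data center of a decision support system, a data warehouse can be regarded as a set of materialized views defined over multiple different data sources. In recent years, research on materialized view maintenance algorithms for data warehouses has attracted considerable attention. When multiple independent data sources perform concurrent updates, traditional materialized view maintenance algorithms may cause view maintenance anomalies. This paper proposes a bidirectional-scan, parallel-processing materialized view maintenance (BSP) algorithm that guarantees complete consistency between the materialized view and the data sources. An experimental comparison with similar algorithms shows that it is relatively efficient.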

3.
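To speed up query processing over large volumes of data, a data warehouse usually stores data as materialized views. When the base data changes, these materialized views must be updated accordingly, so view self-maintenance and consistency maintenance are important problems for data warehouses. This paper proposes creating auxiliary views from the intermediate results of view computation, materializing them in the data warehouse, and using an efficient incremental maintenance algorithm to compute the exact changes to the materialized views, thereby achieving view self-maintenance in the data warehouse.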
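
The idea of self-maintenance through an auxiliary view can be illustrated with a small Python sketch. It is not the algorithm from the paper above: the view (total order amount per customer region), the auxiliary mapping `aux_region`, and all class and method names are hypothetical, chosen only to show how a warehouse might apply source deltas without querying the sources again.

```python
from collections import defaultdict


class SelfMaintainedView:
    """Hypothetical materialized view: SUM(amount) of orders per customer region."""

    def __init__(self, customers):
        # Auxiliary view (an assumption of this sketch): customer_id -> region,
        # stored at the warehouse so that order deltas can be joined locally.
        self.aux_region = dict(customers)
        self.totals = defaultdict(int)  # region -> SUM(amount)
        self.counts = defaultdict(int)  # region -> COUNT(*), needed for deletions

    def apply_delta(self, inserted=(), deleted=()):
        """Apply inserted/deleted order tuples (customer_id, amount) incrementally."""
        for cust_id, amount in inserted:
            region = self.aux_region[cust_id]
            self.totals[region] += amount
            self.counts[region] += 1
        for cust_id, amount in deleted:
            region = self.aux_region[cust_id]
            self.totals[region] -= amount
            self.counts[region] -= 1
            if self.counts[region] == 0:
                # No contributing orders remain, so the group leaves the view.
                del self.totals[region]
                del self.counts[region]

    def snapshot(self):
        """Return the current view contents as a plain dict."""
        return dict(self.totals)


view = SelfMaintainedView({1: "north", 2: "south"})
view.apply_delta(inserted=[(1, 10), (2, 5), (1, 7)])
after_inserts = view.snapshot()
view.apply_delta(deleted=[(2, 5)])
after_delete = view.snapshot()
```

The per-group count is kept alongside the sum so that a deletion can tell when a group becomes empty; without it, the view could not be maintained from deltas alone.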

4.
We consider the problems of computing aggregation queries in temporal databases and of maintaining materialized temporal aggregate views efficiently. The latter problem is particularly challenging since a single data update can cause aggregate results to change over the entire time line. We introduce a new index structure called the SB-tree, which incorporates features from both segment-trees and B-trees. SB-trees support fast lookup of aggregate results based on time and can be maintained efficiently when the data change. We extend the basic SB-tree index to handle cumulative (also called moving-window) aggregates, considering separately cases when the window size is or is not fixed in advance. For materialized aggregate views in a temporal database or warehouse, we propose building and maintaining SB-tree indices instead of the views themselves. Received: 20 March 2001; Accepted: 21 March 2001; Published online: 17 September 2003. This work was supported by the National Science Foundation under grant IIS-9811947 and by NASA Ames under grant NCC2-5278. Edited by R. Snodgrass.

5.
Consistency Algorithms for Multi-Source Warehouse View Maintenance   Total citations: 1, self-citations: 0, citations by others: 1
A warehouse is a data repository containing integrated information for efficient querying and analysis. Maintaining the consistency of warehouse data is challenging, especially if the data sources are autonomous and views of the data at the warehouse span multiple sources. Transactions containing multiple updates at one or more sources, e.g., batch updates, complicate the consistency problem. In this paper we identify and discuss three fundamental transaction processing scenarios for data warehousing. We define four levels of consistency for warehouse data and present a new family of algorithms, the Strobe family, that maintain consistency as the warehouse is updated, under the various warehousing scenarios. All of the algorithms are incremental and can handle a continuous and overlapping stream of updates from the sources. Our implementation shows that the algorithms are practical and realistic choices for a wide variety of update scenarios.

6.
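As the main information entities stored in a data warehouse, materialized views consist of data extracted, transformed, transferred, and loaded from upper-level or external data sources. How to maintain the consistency of materialized views and perform OLAP queries when the source data changes is a research topic of practical significance. This paper proposes an improved algorithm, Glide*, which uses the idea of compensation to coordinate consistency between the source databases and the materialized views, yielding substantial improvements in both system memory overhead and maintenance workload. An example illustrates how the algorithm is applied in practice.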

7.
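Research on Consistency Maintenance and Drill-Down of Data Warehouse Views   Total citations: 4, self-citations: 0, citations by others: 4
A data warehouse is an integrated information repository used for querying and decision analysis. As the main information entities stored in a data warehouse, materialized views consist of data extracted, transformed, transferred, and loaded from upper-level or external data sources. How to maintain the consistency of materialized views and perform OLAP queries when the source data changes is a research topic of practical significance. The Glide algorithm proposed in this paper uses version control, compensation, and an acknowledgment mechanism to coordinate data updates between the source databases and the data warehouse. It guarantees the consistency of view maintenance and drill-down in the data warehouse, improves the robustness of the algorithm and CPU utilization on the source database side, and is a fundamental improvement over earlier algorithms of its kind. The paper states that Glide is completely consistent and gives a rigorous mathematical proof. An example illustrates how the algorithm is applied in practice.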

8.
Indexing mobile objects using dual transformations   Total citations: 4, self-citations: 0, citations by others: 4
With the recent advances in wireless networks, embedded systems, and GPS technology, databases that manage the location of moving objects have received increased interest. In this paper, we present indexing techniques for moving object databases. In particular, we propose methods to index moving objects in order to efficiently answer range queries about their current and future positions. This problem appears in real-life applications such as predicting future congestion areas in a highway system or allocating more bandwidth for areas where a high concentration of mobile phones is imminent. We address the problem in external memory and present dynamic solutions, both for the one-dimensional and the two-dimensional cases. Our approach transforms the problem into a dual space that is easier to index. Important in this dynamic environment is not only query performance but also the update processing, given the large number of moving objects that issue updates. We compare the dual-transformation approach with the TPR-tree, an efficient method for indexing moving objects that is based on time-parameterized index nodes. An experimental evaluation shows that the dual-transformation approach provides comparable query performance but has much faster update processing. Moreover, the dual method does not require establishing a predefined query horizon. Received: 27 April 2003; Accepted: 11 May 2004; Published online: 14 September 2004. Edited by: J. Veijalainen. George Kollios: Supported by NSF CAREER Award 0133825. Dimitrios Gunopulos: Supported by NSF ITR 0220148, NSF CAREER Award 9984729, NSF IIS-9907477, and NRDRP. Vassilis J. Tsotras: Supported by NSF IIS-9907477, NSF EIA-9983445 and the DoD.
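
A one-dimensional version of the dual transformation mentioned in this abstract can be sketched in Python. The sketch assumes linear motion y(t) = a + v * t, so each trajectory becomes the point (v, a) in the dual plane, and a range query at time t_q becomes the constraint y_low <= a + v * t_q <= y_high on those points. The function names are hypothetical, and the linear scan stands in for whatever index structure would actually be built over the dual points; it is not the paper's external-memory method.

```python
def to_dual(intercept, velocity):
    """Map a 1-D trajectory y(t) = intercept + velocity * t to a dual point (v, a)."""
    return (velocity, intercept)


def in_range_at(dual_point, t_q, y_low, y_high):
    """Check whether the object is inside [y_low, y_high] at time t_q."""
    v, a = dual_point
    return y_low <= a + v * t_q <= y_high


def range_query(dual_points, t_q, y_low, y_high):
    """Return ids of objects whose position at t_q falls in the range.

    A real system would index the dual points; this sketch scans them.
    """
    return [oid for oid, p in dual_points.items()
            if in_range_at(p, t_q, y_low, y_high)]


# Hypothetical objects: id -> dual point built from (intercept, velocity).
objects = {
    "a": to_dual(0, 2),     # at t = 5: position 10
    "b": to_dual(20, -1),   # at t = 5: position 15
    "c": to_dual(100, 0),   # stationary at 100
}
hits = range_query(objects, t_q=5, y_low=9, y_high=16)
```

Because an object's dual point changes only when its velocity or reference position changes, an update amounts to deleting one point and inserting another.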

9.
Data integration over multiple heterogeneous data sources has become increasingly important for modern applications. The integrated data is usually stored as materialized views to allow better access, performance, and high availability. In loosely coupled environments, such as the data grid, the data sources are autonomous. Hence, the source updates can be concurrent and cause erroneous results during view maintenance. State-of-the-art maintenance strategies apply compensating queries to correct such errors, making the restricting assumption that all source schemata remain static over time. However, in such dynamic environments, the data sources may change not only their data but also their schema. Consequently, either the maintenance queries or the compensating queries may fail. In this paper, we propose a novel framework called DyDa that overcomes these limitations and handles both source data updates and schema changes. We identify three types of maintenance anomalies, caused by either source data updates, data-preserving schema changes, or non-data-preserving schema changes. We propose a compensation algorithm to solve the first two types of anomalies. We show that the third type of anomaly is caused by the violation of dependencies between maintenance processes. Then, we propose dependency detection and correction algorithms to identify and resolve the violations. Put together, DyDa extends prior maintenance solutions to solve all types of view maintenance anomalies. The experimental results show that DyDa imposes a minimal overhead on data update processing while allowing for the extended functionality to handle concurrent schema changes.

10.
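In materialized view maintenance for data warehouses, handling concurrent updates effectively is an important and difficult problem. This paper describes typical scenarios in a P2P environment in which schema updates and data updates are fully concurrent, analyzes why concurrent updates cause view maintenance anomalies, and proposes a correction strategy for each cause. It presents a method for detecting concurrent updates based on temporal calculus and an effective algorithm for detecting related updates under mixed updates. Finally, it proposes an enhanced proxy mechanism that solves the out-of-order commit problem, ensuring consistency between the data warehouse and the data sources.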

11.
View materialization is an effective method to increase query efficiency in a data warehouse and improve OLAP query performance. However, one encounters the problem of space insufficiency if all possible views are materialized in advance. Reducing query time by means of selecting a proper set of materialized views with a lower cost is crucial for efficient data warehousing. In addition, the costs of data warehouse creation, query, and maintenance have to be taken into account while views are materialized. In this paper, we propose efficient algorithms to select a proper set of materialized views, constrained by storage and cost considerations, to help speed up the entire data warehousing process. We derive a cost model for data warehouse query and maintenance as well as efficient view selection algorithms that effectively exploit the gain and loss metrics. The main contribution of our paper is to speed up the selection process of materialized views. Concurrently, this will greatly reduce the overall cost of data warehouse query and maintenance.

12.
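Online maintenance of materialized views is a key technique in data warehouse system maintenance. It allows the materialized view data in the warehouse to be updated promptly without affecting users' normal business operations. However, online analytical processing (OLAP), a major application of data warehouses, can suffer serious data inconsistency while materialized views are being maintained online. To solve this problem, the paper introduces the concept of a "Maintaining Database" and proposes TVM, a transaction-triggered view maintenance algorithm that uses an acknowledgment mechanism to achieve data consistency.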

13.
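Consistency Maintenance of Materialized Views in Data Warehouses: Implementation of the ECA Algorithm   Total citations: 2, self-citations: 0, citations by others: 2
A view in a data warehouse system is not merely a logical concept; it also exists physically. When the content of a data source changes, the data in the warehouse must be modified accordingly to keep the two consistent. This paper describes in detail how to implement the ECA algorithm, a consistency maintenance algorithm for materialized views over a single data source.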

14.
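Analysis and Comparison of Materialized View Maintenance Algorithms in Data Warehouses   Total citations: 1, self-citations: 0, citations by others: 1
As data sources are updated, the materialized views in a data warehouse must be refreshed promptly. Updating materialized views efficiently, so as to meet users' requirements for query response time and for the consistency and freshness of query results, is a complex and important task in data warehouse technology and a key technical problem in urgent need of a solution. Taking the update and maintenance of materialized views as its main subject, this paper studies and analyzes the existing maintenance algorithms in depth, compares and summarizes them systematically, and finally points out directions for further research on the problem.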

15.
Information Systems, 2001, 26(5): 363-381
A data warehouse (DW) can be abstractly seen as a set of materialized views defined over a set of remote data sources. A DW is intended to satisfy a set of queries. The views materialized in a DW relate to each other in a complex manner, through common subexpressions, in order to guarantee high query performance and low view maintenance cost. DWs are time varying. As time passes new materialized views are added in order to satisfy new queries, or for performance reasons, while old queries are dropped. The evolution of a DW can result in a redundant set of materialized views. In this paper, we address the problem of detecting redundant materialized views in a given DW view selection, that is, materialized views that can be removed from DW without negatively affecting the query evaluation or the view maintenance process. Using an AND/OR dag representation for multiple queries and views, we first formalize the process of propagating source relation changes to the materialized views by exploiting common subexpressions between views and by using other materialized views that are not affected by these changes. Then, we provide an algorithm for detecting materialized views that are not needed in the process of propagating source relation changes to the DW. We also show how trivially redundant views can be identified in this process. Finally, we use these results to provide a procedure for detecting materialized views that are redundant in a DW. Our approach considers a broad class of views that includes grouping/aggregation views and is not dependent on a specific cost model.

16.
Operator scheduling in data stream systems   Total citations: 5, self-citations: 0, citations by others: 5
In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment must be prepared to deal gracefully with bursts in data arrival without compromising system performance. We discuss one strategy for processing bursty streams: adaptive, load-aware scheduling of query operators to minimize resource consumption during times of peak load. We show that the choice of an operator scheduling strategy can have significant impact on the runtime system memory usage as well as output latency. Our aim is to design a scheduling strategy that minimizes the maximum runtime system memory while maintaining the output latency within prespecified bounds. We first present Chain scheduling, an operator scheduling strategy for data stream systems that is near-optimal in minimizing runtime memory usage for any collection of single-stream queries involving selections, projections, and foreign-key joins with stored relations. Chain scheduling also performs well for queries with sliding-window joins over multiple streams and multiple queries of the above types. However, during bursts in input streams, when there is a buildup of unprocessed tuples, Chain scheduling may lead to high output latency. We study the online problem of minimizing maximum runtime memory, subject to a constraint on maximum latency. We present preliminary observations, negative results, and heuristics for this problem. A thorough experimental evaluation is provided where we demonstrate the potential benefits of Chain scheduling and its different variants, compare it with competing scheduling strategies, and validate our analytical conclusions. Received: 18 October 2003; Accepted: 16 April 2004; Published online: 14 September 2004. Edited by: J. Gehrke and J. Hellerstein. Brian Babcock: Supported in part by a Rambus Corporation Stanford Graduate Fellowship and NSF Grant IIS-0118173. Shivnath Babu: Supported in part by NSF Grants IIS-0118173 and IIS-9817799. Mayur Datar: Supported in part by Siebel Scholarship and NSF Grant IIS-0118173. Rajeev Motwani: Supported in part by NSF Grant IIS-0118173, an Okawa Foundation Research Grant, an SNRC grant, and grants from Microsoft and Veritas. Dilys Thomas: Supported by NSF Grant EIA-0137761 and NSF ITR Award Number 0331640.
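
The abstract does not spell out how Chain assigns priorities, so the following Python sketch reflects one common reading of the strategy and should be treated as an assumption rather than the authors' implementation. Each operator in a pipeline is described by a hypothetical pair (time cost, selectivity); plotting cumulative time against remaining tuple size gives a progress chart, and an operator's priority is taken to be the steepness of the lower-envelope segment it lies on. Operators on steeper segments free memory faster and would be scheduled first.

```python
def chain_priorities(operators):
    """Compute one priority per operator in a single pipeline.

    operators: list of (time_cost, selectivity) in pipeline order, with
    time_cost > 0. The priority is the rate of tuple-size reduction along
    the lower envelope of the progress chart (an assumed reading of Chain).
    """
    # Progress chart: (cumulative time, remaining tuple size) after each operator.
    points = [(0.0, 1.0)]
    for cost, selectivity in operators:
        t, size = points[-1]
        points.append((t + cost, size * selectivity))

    priorities = [0.0] * len(operators)
    i = 0
    while i < len(operators):
        t_i, s_i = points[i]
        best_j, best_slope = i + 1, None
        # Look ahead for the point reachable with the steepest size reduction.
        for j in range(i + 1, len(points)):
            t_j, s_j = points[j]
            slope = (s_i - s_j) / (t_j - t_i)
            if best_slope is None or slope > best_slope:
                best_j, best_slope = j, slope
        # All operators on this envelope segment share its steepness.
        for k in range(i, best_j):
            priorities[k] = best_slope
        i = best_j
    return priorities


# A cheap, highly selective operator followed by a final operator.
steep_first = chain_priorities([(1, 0.2), (1, 0.0)])
# A weak filter followed by a strong one: both fall on one envelope segment.
grouped = chain_priorities([(1, 0.9), (1, 0.1)])
```

In the second call the first operator alone reduces size slowly, but together with its successor it reduces size quickly, so the two are grouped onto one segment and receive the same priority.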

17.
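In recent years, the focus of the data warehouse maintenance problem has shifted to view maintenance under concurrent updates from multiple information sources. Popular algorithms such as ECA and Strobe require the data warehouse to be in a quiescent state when resolving concurrent updates. The online error correction method discussed in this paper needs no additional local compensation operations and does not require the warehouse to be quiescent during maintenance. The paper further proposes an optimized online error correction algorithm, called the parallel online error correction algorithm. It revises and enhances the original function modules and presents the modified function module diagram; it supports parallel maintenance and improves maintenance performance.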

18.
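Data warehouse architecture is an important theoretical foundation for building and maintaining data warehouses. The traditional architectural framework is simple and easy to implement, but it is incomplete. The WHIPS model proposed at Stanford University solves the problem of automatically detecting updates at information sources, but a bottleneck in the model itself causes parallel update processing to block. This paper therefore proposes an improved scheme that introduces a timestamp unit, increases the parallel processing capability of two important modules, and presents a revised data warehouse system architecture.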

19.
Italiano I.C., Ferreira J.E. Computer, 2006, 39(3): 53-57
Data warehouse use has increased significantly in recent years and now plays a fundamental role in many organizations' decision-support processes. A framework that uses parameter sets to define the most suitable synchronization option for a given transaction processing environment helps decrease the update time between the transactional and analytical systems and also reduces the hardware resources required to keep an acceptable data update. The frequency of a data warehouse loading process defines the points of update between the transaction systems and the warehouse with its analytical applications. Normally, data warehouses rely on static updates, with batch loading processes occurring at daily, weekly, monthly, or other periodic intervals. However, today's business needs require an analytical environment that provides (i) continuous data integration with shorter periods for capturing and loading from operational sources, (ii) an active decision engine that can make recommendations, and (iii) high availability. Synchronizing a data warehouse in real time with transactional systems thus requires reducing the interval between update points. To achieve this dynamic option, the analytical database system must immediately reflect updates on transactional data.

20.
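WebView Management Method Based on Access and Update History   Total citations: 1, self-citations: 0, citations by others: 1
Zhang Yan, Tang Shiwei, Yang Dongqing. Computer Engineering, 2002, 28(7): 56-57, 118
This paper introduces a hybrid method called MEDI (Materialize on accEss and upDate hIstory). It generates a WebView at user query time and stores it in materialized form. When the base data changes, the WebView is not necessarily refreshed immediately; instead, the method consults the WebView's previous update history and then chooses an appropriate strategy. Experimental data show that the MEDI algorithm adapts well to Web environments in which both updates and accesses are frequent, and that it outperforms purely virtual or purely materialized approaches.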
