Similar literature
 20 similar documents found; search time 0 ms
1.
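Reflective middleware is a current focus of middleware research; its defining feature is good support for the flexibility and adaptability of middleware systems. Data grids typically operate in dynamically changing environments and serve differing user requirements, so support for flexibility and adaptability is essential to a good data grid and data replication system. DPRM (Dynamic priority based Reflective Middleware for Data Replication) provides this support by dynamically configuring the priorities of data and of SEs (storage elements). Simulation results obtained with OptorSim show that DPRM performs well.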

2.
A data grid is a distributed collection of storage and computational resources that are not bounded within a geophysical location. It is a fast-growing area of research, and providing efficient data access and maximum data availability is a challenging task. To achieve this, data is replicated to different sites. A number of data replication techniques have been presented for data grids. Each replication technique addresses some attributes, such as fault tolerance, scalability, bandwidth consumption, performance, storage consumption, and data access time. In this paper, the issues involved in data replication are identified, and different replication techniques are studied to find out which attributes a given technique addresses and which it ignores. A tabular representation of these parameters is presented to facilitate future comparison of dynamic replication techniques. The paper also discusses future work in this direction by identifying some open research problems.

3.
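Analysis of Key Technologies for Data Grids   Total citations: 5, self-citations: 0, citations by others: 5
This paper describes the relationship between data grids and computational grids and analyzes the basic characteristics and architecture of data grids. It focuses on several key technologies for building a data grid. Finally, using an example data grid project, it discusses the application areas and practical value of data grids.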

4.
In recent years, grid technology has grown so fast that it is now used in many scientific experiments and research centers. A large number of storage elements and computational resources are combined to form a grid, which gives us shared access to extra computing power. In particular, a data grid deals with data-intensive applications and provides intensive resources across widely distributed communities. Data replication is an efficient way of distributing replicas among the data grids, making it possible to access the same data in different locations of the data grid. Replication reduces data access time and improves the performance of the system. In this paper, we propose a new dynamic data replication algorithm named PDDRA that optimizes the traditional algorithms. Our proposed algorithm is based on an assumption: members of a VO (Virtual Organization) have similar interests in files. Based on this assumption and on file access history, PDDRA predicts the future needs of grid sites and pre-fetches a sequence of files to the requesting grid site, so that the next time this site needs a file, it will be locally available. This considerably reduces access latency, response time, and bandwidth consumption. PDDRA consists of three phases: storing file access patterns; requesting a file and performing replication and pre-fetching; and replacement. The algorithm was tested using a grid simulator, OptorSim, developed by the European Data Grid projects. The simulation results show that our proposed algorithm performs better than other algorithms in terms of job execution time, effective network usage, total number of replications, hit ratio, and percentage of storage filled.
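
The prediction and pre-fetching idea in this abstract can be illustrated with a very small sketch. It is not PDDRA itself: the abstract does not give the data structures, so the successor-count model, the class name `SuccessorPredictor`, and the function `prefetch_chain` below are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


class SuccessorPredictor:
    """Toy predictor (hypothetical): remembers which file tends to follow which."""

    def __init__(self):
        # successors[a][b] = how often file b was requested right after file a
        self.successors = defaultdict(Counter)
        self.previous = None

    def record(self, file_id):
        """Store one access in the pattern history."""
        if self.previous is not None:
            self.successors[self.previous][file_id] += 1
        self.previous = file_id

    def predict(self, file_id, count=2):
        """Return up to `count` files most often requested after `file_id`."""
        ranked = self.successors.get(file_id, Counter()).most_common(count)
        return [name for name, _ in ranked]


def prefetch_chain(predictor, file_id, length=3):
    """Follow the most likely successor repeatedly to build a prefetch sequence."""
    chain, current, seen = [], file_id, {file_id}
    for _ in range(length):
        nxt = predictor.predict(current, count=1)
        if not nxt or nxt[0] in seen:
            break  # no history for this file, or the chain would loop
        chain.append(nxt[0])
        seen.add(nxt[0])
        current = nxt[0]
    return chain
```

A site would call `record` on every request and, on a miss for some file, fetch that file together with the files returned by `prefetch_chain`. A shared history per VO, as the abstract assumes, would mean feeding one predictor with the accesses of all member sites.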

5.
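adPD: A Speed-Adaptive Dynamic Parallel Downloading Technique   Total citations: 5, self-citations: 0, citations by others: 5
After reviewing existing parallel downloading algorithms, this paper proposes adPD, a new speed-adaptive dynamic parallel downloading mechanism. By dynamically assigning download tasks of different sizes to connections of different speeds, adPD adapts well to changes in connection speed, allocates the download workload in proportion to speed, and makes full use of the available bandwidth. In addition, by dividing the file into blocks of variable size, adPD keeps the number of data requests as small as possible and shortens the idle time spent waiting on requests, which raises download speed while reducing the load on the serving nodes. Finally, experimental results are used to analyze the actual performance of adPD and confirm that it is an efficient parallel downloading algorithm.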
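
Allocating work in proportion to connection speed can be sketched as follows. This is only an illustration of the proportional split described above, not the adPD implementation; the function name `allocate_by_speed` and the choice to hand rounding leftovers to the fastest connection are assumptions.

```python
def allocate_by_speed(remaining_bytes, speeds):
    """Split `remaining_bytes` across connections in proportion to speed.

    `speeds` maps a connection name to its measured throughput as an
    integer (for example bytes per second). Returns a dict mapping each
    connection name to the number of bytes it should request next.
    """
    total = sum(speeds.values())
    if total <= 0:
        raise ValueError("at least one connection must have positive speed")
    shares = {name: remaining_bytes * s // total for name, s in speeds.items()}
    # Integer division can leave a few bytes unassigned; give them to the
    # fastest connection (an arbitrary choice made for this sketch).
    leftover = remaining_bytes - sum(shares.values())
    fastest = max(speeds, key=speeds.get)
    shares[fastest] += leftover
    return shares
```

Re-running the split each time a connection finishes its block, with freshly measured speeds, gives blocks of variable size and keeps slow connections from holding up the tail of the download.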

6.
Data replication techniques are used in data grids to reduce makespan, storage consumption, access latency, and network bandwidth. Data replication enhances data availability and thereby increases system reliability. Data replication involves two steps: replica placement and replica selection. Replica placement involves identifying the best possible node on which to duplicate data, based on network latency and user requests. Replica selection involves selecting the best replica location from which to access the data for job execution in the data grid. Various replica placement and selection algorithms are available in the literature. These algorithms measure and analyze different parameters such as bandwidth consumption, access cost, scalability, execution time, storage consumption, and makespan. In this paper, various replica placement and selection strategies are discussed along with their merits and demerits. The paper also analyzes the performance of these strategies with respect to the parameters mentioned above. In particular, it focuses on dynamic replica placement and selection strategies in the data grid environment.

7.
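Deng Wei, Li Ji, Dong Xiaohua, Zhu Zhengzhou, Wu Zhongfu. Computer Engineering (《计算机工程》), 2007, 33(6): 156-157, 169
The growth of data grid resources and their widening geographic distribution degrade network performance and introduce transmission delays, which makes the scalability of dynamic replication algorithms especially important. Current dynamic replication algorithms for data grids do not scale well. This paper therefore proposes a dynamic replica placement strategy on HMLT, a topology based on hierarchical management, and applies it to resource management in distance education.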

8.
Real-time Grid applications are emerging in many disciplines of science and engineering. To run these applications while meeting their associated real-time constraints, the Grid infrastructure should be designed to respect these constraints and to allocate its computing, networking, storage, and other resources accordingly. Furthermore, these applications involve a large number of data-intensive jobs and require access to terabytes of data in real time. On the other hand, a variety of dynamic file replication algorithms have been proposed for best-effort Data Grid environments in an attempt to decrease job completion times and save network bandwidth. Until now, no study in the literature has examined the real-time performance of these dynamic file replication algorithms. Motivated by this, the present study evaluates the performance of eight dynamic replication algorithms under various Data Grid settings. For this evaluation, a process-oriented, discrete-event-driven simulator called DGridSim was developed. A detailed set of simulation studies was conducted using DGridSim, and the results are presented to reveal the real-time performance of the dynamic file replication algorithms.

9.
This paper studies the Quality-of-Service (QoS)-aware replica placement problem in a general graph model. Since the problem has been proved NP-hard, heuristic algorithms are the current solutions. However, these algorithms cannot always find an effective replica placement strategy. We propose two algorithms that can obtain better results within a given time period. The first, called the Cover Distance algorithm, is based on the Greedy Cover algorithm. The second is an optimized genetic algorithm, in which we use random heuristic algorithms to generate the initial population and so avoid a large amount of useless searching. The 0-Greedy-Delete algorithm is then used to optimize the genetic algorithm's solutions. According to the performance evaluation, our Cover Distance algorithm obtains relatively better solutions in time-critical scenarios, whereas the optimized genetic algorithm is better when replica cost has higher priority than algorithm execution time. The QoS-aware data replication heuristic algorithms are applied to the data distribution service of an astronomy data grid pipeline prototype, and the operation process is studied in detail.
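
For orientation, a plain greedy-cover heuristic for QoS-aware placement might look like the sketch below. It is the textbook greedy set-cover step, not the Cover Distance algorithm proposed in the paper, whose details the abstract does not give; the function name and the `distance` and `qos` input layout are assumptions.

```python
def greedy_replica_placement(distance, qos):
    """Pick replica sites so every node has a replica within its QoS limit.

    distance[u][v] is the cost from node u to node v (distance[u][u] == 0);
    qos[u] is the largest cost node u tolerates (non-negative).
    """
    nodes = list(distance)
    # covers[s] = set of nodes whose QoS limit is met by a replica at s
    covers = {s: {u for u in nodes if distance[u][s] <= qos[u]} for s in nodes}
    uncovered, chosen = set(nodes), []
    while uncovered:
        # Greedy step: take the site that satisfies the most uncovered nodes.
        # Every node covers itself, so each step makes progress.
        best = max(nodes, key=lambda s: len(covers[s] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen
```

On a three-node chain with unit link costs and a QoS limit of 1 everywhere, the middle node alone satisfies all nodes, so the heuristic returns a single replica site.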

10.
With the availability of content delivery networks (CDNs), many database-driven Web applications rely on data centers that host applications and database contents for better performance and higher reliability. However, this raises additional issues of database/data center synchronization, query/transaction routing, load balancing, and application result correctness/precision. In this paper, we investigate these issues in the context of data center synchronization for such load- and precision-critical Web applications in a distributed data center infrastructure. We develop a scalable scheme for adaptive synchronization of data centers that maintains the load and application precision requirements. A prototype has been built to evaluate the proposed scheme. The experimental results show that the scheme is effective in maintaining both application result precision and load distribution while adapting to traffic patterns and system capacity limits.

11.
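ETL (Extraction-Transformation-Loader) is a key technology for exchanging and sharing information resources within and between enterprises. As enterprise data volumes surge, improving data processing capacity and execution efficiency has become one of the hard problems ETL must solve. This paper proposes a cache-based framework for concurrent processing of ETL data flows. The framework uses a cache-reuse technique based on component classification to reduce memory consumption and the number of data copies. It also uses a concurrent scheduling and execution strategy for data processing flows that supports parallelism at multiple granularities: task, pipeline, and data processing. The method has been implemented and validated in ONCE DQ on the 网驰 platform.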

12.
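Data Partitioning Strategies in Parallel Ray Tracing   Total citations: 1, self-citations: 0, citations by others: 1
A new method for partitioning scene data is proposed, in which the pixels to be computed determine an effective partition of the scene data. This avoids the frequent exchange of rays between processors caused by earlier blind partitioning of the data and thus reduces communication overhead.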

13.
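To solve practical problems, big-data analysis and processing systems need to acquire data, yet the real data collected in practice are usually incomplete. Moreover, most solutions are either problem-driven or rely on data analysis alone, and runtime parameters are tuned and set largely blindly, which makes intelligent applications hard to achieve. This paper therefore proposes the concept and framework of parallel data: virtual big data are generated from real data through computational experiments and, in combination with Merton's laws, the desired solution is placed in generalized duality with the problem so that the big data are focused on the actual problem. Real and virtual data interact dynamically and evolve in parallel, forming a process in which the virtual and the real reinforce each other and the data change dynamically; the data ultimately acquire intelligence and can then address unknown problems. Parallel data are not only a form of data representation but also a mechanism and mode of data evolution, characterized by virtual-real interaction; the dynamical trajectories of all the data constitute a data dynamics system. Parallel data offer a new paradigm for data processing, representation, mining, and application.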

14.
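Research on Data Reorganization in Parallel Database Systems   Total citations: 2, self-citations: 0, citations by others: 2
Data skew has a very large impact on the performance of parallel database systems. This paper proposes a record-movement method for resolving initial data skew, determines the boundaries between record movement for data balancing, record movement for load balancing, and conventional repartitioning, and presents a simulation study with examples.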

15.
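This paper proposes an efficient parallel binary join algorithm that copes with various kinds of data skew. It activates different modules under different skew conditions to overcome the load imbalance that data skew causes.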

16.
A dynamic data replication strategy using access-weights in data grids   Total citations: 2, self-citations: 0, citations by others: 2
Data grids regularly deal with huge amounts of data. Ensuring efficient access to such widely distributed data sets is a fundamental challenge. Creating replicas at suitable sites through a data replication strategy can increase system performance: it shortens data access time and reduces bandwidth consumption. In this paper, a dynamic data replication mechanism called Latest Access Largest Weight (LALW) is proposed. LALW selects a popular file for replication and calculates a suitable number of copies and suitable grid sites for replication. By associating a different weight with each historical data access record, the importance of each record is differentiated. A more recent data access record has a larger weight, indicating that the record is more pertinent to the current data access situation. A Grid simulator, OptorSim, is used to evaluate the performance of this dynamic replication strategy. The simulation results show that LALW successfully increases the effective network usage. This means that the LALW replication strategy can identify a popular file and replicate it to a suitable site without increasing the network burden too much.
Ruay-Shiung Chang
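
The weighting idea in this abstract, where recent access records count for more, can be sketched as below. The abstract does not state the weight function, so the per-interval decay factor of 0.5 and the function names are assumptions made for illustration, not the published LALW formula.

```python
from collections import defaultdict


def weighted_popularity(intervals, decay=0.5):
    """Score files from per-interval access counts, newest interval last.

    intervals: list of dicts mapping file name -> access count in that interval.
    decay: factor applied once per step back in time (assumed value).
    """
    scores = defaultdict(float)
    newest = len(intervals) - 1
    for index, counts in enumerate(intervals):
        weight = decay ** (newest - index)  # newest interval gets weight 1
        for name, count in counts.items():
            scores[name] += weight * count
    return dict(scores)


def most_popular(intervals, decay=0.5):
    """Return the file with the highest weighted score."""
    scores = weighted_popularity(intervals, decay)
    return max(scores, key=scores.get)
```

With the history `[{"old": 6}, {"new": 4}]`, a plain count would favor `old`, but the weighted scores are 3.0 for `old` and 4.0 for `new`, so the more recently requested file is selected for replication.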

17.
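A replica creation strategy based on access frequency is proposed. The strategy creates replicas mainly according to how often grid users access file replicas; replacement is also driven by frequency, with infrequently accessed replicas being deleted. The strategy satisfies users' needs for the replicas they access and improves replica transfer rates and bandwidth utilization. Taking into account the characteristics of the grid structure and the environment the algorithm requires, the paper modifies modules of the grid simulator OptorSim and tests the algorithm. The test results show that the access-frequency-based replica creation algorithm improves the efficiency with which users access replicas.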
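
A minimal sketch of frequency-driven creation and replacement is given below. The abstract does not specify thresholds or data structures, so the class `FrequencyReplicaStore`, its capacity measured in number of files, and the rule "replace only when the newcomer is accessed more often than the least-used replica" are all assumptions.

```python
class FrequencyReplicaStore:
    """Toy storage element (hypothetical) that keeps frequently accessed files."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of replicas held locally
        self.replicas = set()         # files currently stored here
        self.frequency = {}           # file name -> number of accesses seen

    def access(self, name):
        """Record an access; return True if it was served from a local replica."""
        self.frequency[name] = self.frequency.get(name, 0) + 1
        if name in self.replicas:
            return True
        self._maybe_replicate(name)
        return False

    def _maybe_replicate(self, name):
        if len(self.replicas) < self.capacity:
            self.replicas.add(name)   # free space: create the replica directly
            return
        if not self.replicas:
            return                    # capacity is zero, nothing can be stored
        # Storage is full: find the least frequently accessed replica.
        victim = min(self.replicas, key=lambda f: self.frequency[f])
        # Replace it only if the new file is accessed more often.
        if self.frequency[name] > self.frequency[victim]:
            self.replicas.remove(victim)
            self.replicas.add(name)
```

The strict comparison keeps a file that is requested once from evicting a replica that is still in regular use, at the cost of a few extra remote reads before a newly popular file is replicated.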

18.
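This paper first explains the basic concepts and advantages of multimaster replication, then analyzes the basic principles of Oracle data synchronization based on multimaster replication, and finally walks through a complete example of implementing such synchronization.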

19.
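A Data Replication Scheme for Wide Area Network Systems   Total citations: 2, self-citations: 0, citations by others: 2
Wang Xiaofeng, Li Wanzhou. Computer Engineering (《计算机工程》), 2001, 27(4): 116-117, 190
Based on an analysis of several currently popular data replication technologies and on the actual conditions of the system, a replication scheme is proposed that combines Sybase-based replication server technology with an in-house replication mechanism that uses encrypted files as the transfer medium.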

20.
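Analysis and Application of Data Replication Techniques in Distributed Databases   Total citations: 19, self-citations: 2, citations by others: 19
Building on distributed database systems, this paper gives an overview of how distributed databases store data and describes in detail several key aspects of distributed database replication. It provides a comprehensive comparative analysis of the replication technologies (schemes) of Oracle, Sybase, and SQL Server, proposes a strategy for choosing a replication scheme based on practical experience, and discusses the outlook for replication technology.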
