Similar Articles
 20 similar articles found (search time: 115 ms)
1.
A Replica Management Strategy Based on Genetic Algorithms   (Cited by: 2; self: 0; others: 2)
In data grid environments, data replication is widely used to improve data reliability and reduce users' data access latency. Introducing replicas raises a direct question: how should an application choose the best replica from a set of replicas, given their performance and access characteristics? This is the replica selection problem. Targeting the key techniques of data replication, namely replica creation and replica selection, this paper applies a genetic algorithm to optimize replica management, built on a pricing-mechanism model and parallel data transfer. The algorithm is evaluated with the grid simulator OptorSim; the results show that the genetic-algorithm-based strategy achieves better performance.
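The abstract does not give the paper's encoding or fitness function; as a rough illustration of genetic-algorithm replica selection combined with a pricing model and parallel transfer, here is a minimal Python sketch in which a chromosome is a bitmask over candidate replica sites and fitness combines parallel transfer time with a hypothetical per-site price. All names, parameters, and constants are illustrative, not from the paper.

```python
import random

def ga_select(bandwidths, prices, file_size, pop=30, gens=60, seed=1):
    """Choose a subset of replica sites for parallel transfer with a GA.

    The file is assumed to be split in proportion to bandwidth, so transfer
    time is file_size / sum(selected bandwidths); fitness adds a per-site
    price, giving a time/cost trade-off (an assumed, simplified model).
    """
    rng = random.Random(seed)
    n = len(bandwidths)

    def cost(mask):
        bw = sum(b for i, b in enumerate(bandwidths) if mask[i])
        if bw == 0:
            return float("inf")        # empty selection is infeasible
        return file_size / bw + sum(p for i, p in enumerate(prices) if mask[i])

    # Initial population: random bitmasks over the n candidate sites.
    popu = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        popu.sort(key=cost)            # elitist: best half survives
        survivors = popu[: pop // 2]
        children = []
        while len(survivors) + len(children) < pop:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:     # occasional bit-flip mutation
                child[rng.randrange(n)] ^= 1
            children.append(child)
        popu = survivors + children
    best = min(popu, key=cost)
    return best, cost(best)
```

With three candidate sites where the third is fast to skip (it is slow and expensive), the GA converges on using only the two cheap, fast sites.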

2.
Effective data replication mechanisms and replica management strategies are important topics in information grid research. This paper outlines the basic approach of using Agent technology to implement data replication in an information grid, and discusses the design and implementation of an Agent-based replication method in the national population information system. Exploiting the autonomy, mobility, and interactivity of Agents, upper layers create replicas at lower layers, and replicas are also created between peers at the same layer; a data-format conversion system resolves the heterogeneity of the underlying databases. The paper focuses on the implementation characteristics of Agent-based push and pull replication, and shows that the Agent-based replication method is both reasonable and efficient for information grids.

3.
Replica management has become one of the main factors affecting data grid performance, and research on efficient replica management algorithms mostly relies on simulating data grid replica management. This paper presents the design and implementation of a simulation tool for data grid replica management, and details solutions to key simulation issues such as job scheduling and job-execution simulation.

4.
Replication is a key technique for improving data access and processing efficiency in grids. Addressing the limitations of current replica management and the pressing problem of consistency maintenance, this paper builds on the replica facilities provided by Globus and, from the perspective of replica creation and update, applies log-based management to propose a solution that combines replica creation with consistency maintenance.

5.
Grids involve copying and transferring large numbers of data files; effective replication saves bandwidth, reduces latency, balances load, and improves system reliability. This paper proposes an access-frequency-based replication management technique that lets users obtain the data they need efficiently, and adopts an improved Fast Upload method to update file replicas, reducing the time spent on replication and updates and improving system efficiency.
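The abstract does not specify how access frequency drives replication; as a minimal sketch of the general idea, the following Python class counts per-site accesses and triggers a local replica once a file's access count at a site crosses a threshold. The class name, threshold, and trigger rule are assumptions for illustration only.

```python
from collections import defaultdict

class FrequencyReplicator:
    """Replicate a file to a site once that site's access count for the
    file reaches a threshold (a simplified stand-in for the paper's
    access-frequency-based management)."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = defaultdict(int)    # (site, file) -> access count
        self.replicas = defaultdict(set)  # file -> sites holding a copy

    def access(self, site, file):
        """Record one access; return True if it triggers a new replica."""
        self.counts[(site, file)] += 1
        if (site not in self.replicas[file]
                and self.counts[(site, file)] >= self.threshold):
            self.replicas[file].add(site)  # create the local replica
            return True
```

A site that reads the same remote file repeatedly soon gets its own copy, after which further accesses are local and trigger nothing.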

6.
Because the global real-time ocean observing network Argo produces enormous volumes of data, this paper analyzes the Argo data formats and characteristics and proposes using grid technology to share and manage Argo data effectively. Key grid techniques, including metadata management, replica management, and data access and integration, are applied to manage the massive Argo data sets, so that users can transparently access and use Argo ocean-observation data resources at any place and any time, making it easier for ocean science research to obtain marine environmental data.

7.
沈薇  刘方爱 《微机发展》2006,16(11):185-187
Grids involve copying and transferring large numbers of data files; effective replication saves bandwidth, reduces latency, balances load, and improves system reliability. This paper proposes an access-frequency-based replication management technique that lets users obtain the data they need efficiently, and adopts an improved Fast Upload method to update file replicas, reducing the time spent on replication and updates and improving system efficiency.

8.
Scientific data in the scientific computing field is growing explosively, and future scientific computing will be data-centric; data grid technology has become an effective way to access and manage complex, massive scientific data. The Gfiddaen data grid system designed and implemented here manages data across multiple distributed, heterogeneous storage resources and provides users with uniform data access. The paper focuses on the architecture of the data grid system, its design principles and goals, and the implementation of its main key techniques.

9.
刘彩燕  白尚旺 《计算机工程与设计》2006,27(17):3163-3164,3177
Data is an important class of grid resource; it can be moved, replicated, and cached. As the data sets in a grid grow in number and size, improving access speed and reducing access latency require moving grid data closer to its consumers and replicating and storing data-set copies on different grid nodes, so that data access services achieve better performance and reliability. Heterogeneous data resources, communication delays, and resource failures in the grid make ensuring replica consistency a highly challenging task. After analyzing replication consistency services based on grid middleware systems and their problems, this paper proposes a targeted architectural design.

10.
A Grid-Based Framework for Multimedia Content Distribution   (Cited by: 1; self: 0; others: 1)
Large-scale Internet-based multimedia content distribution systems make wide use of data replication to improve performance, and grid technology holds broad promise for building large-scale distributed information systems. This paper proposes a grid-based multimedia content distribution framework that, on top of basic grid services, implements replica management, transfer, and lookup/location mechanisms suited to streaming-media applications. The paper also examines in depth key techniques for wide-area distributed information systems, including replica placement algorithms and user-request scheduling mechanisms.

11.
Data replication and consistency refer to storing the same data at distributed sites and keeping it consistent when one or more copies are modified. A good file maintenance and consistency strategy can reduce file access times and access latencies and increase download speeds, thus reducing overall computing time. In this paper, we propose dynamic services for replicating and maintaining data in grid environments and for directing replicas to appropriate locations for use. To address a problem with the Bandwidth Hierarchy-based Replication (BHR) algorithm, a strategy for maintaining replicas dynamically, we propose the Dynamic Maintenance Service (DMS). We also propose a One-way Replica Consistency Service (ORCS) for data grid environments, an approach to consistency maintenance that we hope strikes a balance between improving data access performance and maintaining replica consistency. Experimental results show that our services are more efficient than other strategies.

12.
The Data Grid provides massive aggregated computing resources and distributed storage space to deal with data-intensive applications. Given the limited resources available in the grid and the large volumes of data produced, using Grid resources efficiently becomes an important challenge. Data replication is a key optimization technique for reducing access latency and managing large data sets by storing data wisely. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. In this paper two strategies are proposed. First, a novel job scheduling strategy called Weighted Scheduling Strategy (WSS) uses hierarchical scheduling to reduce the search time for an appropriate computing node; it considers the number of jobs waiting in a queue, the location of the data required by the job, and the computing capacity of the sites. Second, a dynamic data replication strategy called Enhanced Dynamic Hierarchical Replication (EDHR) improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication strategy. It uses an economic model for file deletion when there is not enough space for a new replica; the economic model is based on the future value of a data file. Because good replica placement is important both for obtaining the maximum benefit from replication and for reducing storage cost and mean job execution time, it is also considered in this paper. The proposed strategies are implemented in OptorSim, the European Data Grid simulator. Experimental results show that the proposed strategies achieve better performance by minimizing data access time and avoiding unnecessary replication.
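The abstract says only that EDHR deletes files by an economic model based on future value; as a minimal sketch of value-based eviction, the following Python function uses a file's recent access count as a stand-in for its predicted future value and evicts the lowest-valued files until a new replica fits. The function name, the value estimate, and the signature are assumptions, not EDHR's actual model.

```python
def free_space(storage, needed, capacity, recent_accesses):
    """Evict the files with the lowest estimated future value until
    `needed` bytes of free space are available.

    storage:         file -> size currently held at the site
    recent_accesses: file -> recent access count (proxy for future value)
    Returns the list of evicted files, or None if space cannot be freed.
    """
    used = sum(storage.values())
    # Lowest "future value" first; unknown files count as value 0.
    victims = sorted(storage, key=lambda f: recent_accesses.get(f, 0))
    evicted = []
    for f in victims:
        if capacity - used >= needed:
            break
        used -= storage.pop(f)   # delete the least valuable replica
        evicted.append(f)
    return evicted if capacity - used >= needed else None
```

Frequently accessed files survive; rarely accessed ones are deleted first, mirroring the intuition that a file's future value tracks its recent demand.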

13.
In recent years data grids have been deployed and have grown in many scientific experiments and data centers, giving grid users access to large amounts of distributed data. Data replication is a key issue in a data grid and should be applied intelligently, because it reduces data access time and bandwidth consumption for each grid site; the area is therefore challenging and offers much scope for improvement. In this paper, we introduce a new dynamic data replication algorithm named Popular File Group Replication (PFGR), which rests on three assumptions: users in a grid site (Virtual Organization) have similar interests in files, file accesses exhibit temporal locality, and all files are read-only. Based on the file access history and the first assumption, PFGR builds a connectivity graph over groups of dependent files in each grid site and replicates the most popular file group to the requesting grid site, so that when a user of that site later needs one of those files, it is available locally. Simulation results show that our algorithm improves performance by minimizing mean job execution time and bandwidth consumption while avoiding unnecessary replication.
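The abstract describes building a connectivity graph of dependent files and replicating the most popular group; as a rough illustration, this Python sketch links files that were accessed by the same job, then returns the connected component containing the most-accessed file. The input format (one file set per job) and the popularity measure are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations

def popular_file_group(job_file_sets):
    """Build a connectivity graph linking files accessed together by a
    job, then return the group (connected component) containing the
    most-accessed file -- the set a PFGR-style scheme would replicate."""
    freq = defaultdict(int)
    adj = defaultdict(set)
    for files in job_file_sets:
        for f in files:
            freq[f] += 1
        for a, b in combinations(files, 2):  # co-access => edge
            adj[a].add(b)
            adj[b].add(a)
    top = max(freq, key=freq.get)            # most popular file
    group, stack = {top}, [top]
    while stack:                             # walk its component
        for nxt in adj[stack.pop()]:
            if nxt not in group:
                group.add(nxt)
                stack.append(nxt)
    return group
```

Replicating the whole group at once is what distinguishes this from per-file replication: files that travel together in the access history arrive together at the requesting site.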

14.
Data Mining on the Grid   (Cited by: 24; self: 2; others: 22)
Grids are a hot topic in network computing, distributed computing, and high-performance computing research. As data in the scientific computing field grows dramatically and the sharing of widely distributed massive data becomes a reality in future grid computing environments, data mining will play an important role in extracting useful information and discovering new knowledge and patterns. Considering the characteristics of grids, this paper surveys the features and key techniques of grid data mining, focuses on its architecture and basic process, and concludes with an example of OGSA-based grid data mining.

15.
Data Grid is a geographically distributed environment that deals with large-scale data-intensive applications. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Data replication is another key optimization technique for reducing access latency and managing large data sets by storing data wisely. In this paper two algorithms are proposed. First, a novel job scheduling algorithm called Combined Scheduling Strategy (CSS) uses hierarchical scheduling to reduce the search time for an appropriate computing node; it considers the number of jobs waiting in the queue, the location of the data required by the job, and the computing capacity of the sites. Second, a dynamic data replication strategy called the Modified Dynamic Hierarchical Replication Algorithm (MDHRA) improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication (DHR) strategy. Data replication should be used wisely because the storage capacity of each Grid site is limited, so an effective replica replacement strategy is important. MDHRA replaces replicas based on the time the replica was last requested, the number of accesses, and the size of the replica. It selects the best replica location from among the many replicas based on response time, determined from the data transfer time, the storage access latency, the replica requests waiting in the storage queue, and the distance between nodes. Simulation results demonstrate that the proposed replication and scheduling strategies give better performance than the other algorithms.
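The abstract lists the factors MDHRA-style selection weighs (transfer time, storage latency, queued requests); as a minimal sketch of response-time-based replica selection, the following Python function scores each candidate site and returns the fastest. The per-queued-request wait constant and the input format are assumptions for illustration, not the paper's model.

```python
def best_replica(sites, file_size):
    """Pick the replica site with the lowest estimated response time.

    sites: site -> (bandwidth in MB/s, storage access latency in s,
                    number of requests waiting in the storage queue)
    Response time = transfer time + storage latency + an assumed
    0.5 s wait per queued request.
    """
    def response(site):
        bw, latency, queued = sites[site]
        return file_size / bw + latency + 0.5 * queued
    return min(sites, key=response)
```

A nominally faster site loses if its storage queue is long, which is exactly the trade-off the abstract describes.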

16.
Replica Placement Strategies in Data Grid   (Cited by: 1; self: 0; others: 1)
Replication is a technique used in Data Grid environments that helps to reduce access latency and network bandwidth utilization; it also increases data availability, thereby enhancing system reliability. This research addresses the replication problem in Data Grid environments by investigating a set of highly decentralized dynamic replica placement algorithms. The placement algorithms are based on heuristics that consider both network latency and user requests to select the best candidate sites for replicas. Because of the dynamic nature of the Grid, the sites currently holding replicas may not be the best sites to fetch replicas from in subsequent periods, so a replica maintenance algorithm is proposed that relocates replicas to different sites if the performance metric degrades significantly. The replica placement algorithms are studied using a model of the EU Data Grid Testbed 1 sites [Bell et al. Comput. Appl., 17(4), 2003] and their associated network geometry, and are validated with total file transfer times, the number of local file accesses, and the number of remote file accesses.

17.
Replica management is a key issue for reducing bandwidth consumption, improving data availability, and maintaining data consistency in large distributed systems. Global Replica Management (GRM) maintains data consistency across the entire network and is preferable particularly for multi-group distributed systems. On the other hand, GRM is unfavorable for many applications because replica management requires a very large number of message passes. In this paper, to reduce the number of message passes needed to achieve an efficient GRM strategy, an interconnection structure called the Distributed Spanning Tree (DST) is employed. The DST converts the peer network into logical layered structures and thereby provides a hierarchical mechanism for replica management. It is proved that this hierarchical approach improves data availability and consistency across the entire network, and that the proposed approach reduces data latency and the number of message passes required for any specific application in the network.

18.
Data grids support access to widely distributed storage for large numbers of users accessing potentially many large files. Efficient access is hindered by the high latency of the Internet. To improve access time, replication at nearby sites may be used. Replication also provides high availability, decreased bandwidth use, enhanced fault tolerance, and improved scalability. Resource availability, network latency, and user requests in a grid environment may vary with time. Any replica placement strategy must be able to adapt to such dynamic behavior. In this paper, we describe a new dynamic replica placement algorithm, Popularity Based Replica Placement (PBRP), for hierarchical data grids which is guided by file “popularity”. Our goal is to place replicas close to clients to reduce data access time while still using network and storage resources efficiently. The effectiveness of PBRP depends on the selection of a threshold value related to file popularity. We also present Adaptive-PBRP (APBRP) that determines this threshold dynamically based on data request arrival rates. We evaluate both algorithms using simulation. Results for a range of data access patterns show that our algorithms can shorten job execution time significantly and reduce bandwidth consumption compared to other dynamic replication methods.
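The abstract states that APBRP derives the popularity threshold from request arrival rates without giving the formula; as a toy illustration of a rate-scaled threshold, this Python sketch replicates only files whose access count exceeds a threshold that grows with the arrival rate. The base threshold, the linear scaling, and the function name are all assumptions, not APBRP's actual rule.

```python
def replicas_to_create(access_counts, arrival_rate, base_threshold=5.0):
    """Return the files popular enough to replicate under a threshold
    that scales with the data-request arrival rate: at higher load the
    bar rises, so only the hottest files are replicated."""
    threshold = base_threshold * max(arrival_rate, 1.0)
    return sorted(f for f, count in access_counts.items() if count > threshold)
```

The same access counts yield fewer replication candidates when requests arrive faster, which is the adaptive behavior the abstract attributes to APBRP.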

19.
In recent years, grid technology has grown so quickly that it is used in many scientific experiments and research centers. A large number of storage elements and computational resources are combined into a grid that provides shared access to extra computing power; in particular, data grids serve data-intensive applications and provide intensive resources across widely distributed communities. Data replication is an efficient way to distribute replicas among data grids, making similar data accessible in different locations; it reduces data access time and improves system performance. In this paper, we propose a new dynamic data replication algorithm named PDDRA that improves on traditional algorithms. Our proposed algorithm is based on one assumption: members of a VO (Virtual Organization) have similar interests in files. Using this assumption and the file access history, PDDRA predicts the future needs of grid sites and pre-fetches a sequence of files to the requesting grid site, so the next time that site needs one of those files it is available locally, considerably reducing access latency, response time, and bandwidth consumption. PDDRA consists of three phases: storing file access patterns; requesting a file and performing replication and pre-fetching; and replacement. The algorithm was tested using OptorSim, a grid simulator developed by the European Data Grid project. Simulation results show that our proposed algorithm performs better than other algorithms in terms of job execution time, effective network usage, total number of replications, hit ratio, and percentage of storage filled.
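The abstract says PDDRA predicts future needs from the file access history without detailing the predictor; as a minimal sketch of history-based pre-fetching, this Python class records which files followed which in past access sequences and suggests the most frequent successors. The successor-counting model and class name are illustrative assumptions, not PDDRA's actual method.

```python
from collections import defaultdict

class Prefetcher:
    """Record file-access sequences and, after a request, suggest the
    files that most often followed it, as candidates to pre-fetch."""

    def __init__(self):
        # file -> {following file -> how often it followed}
        self.followers = defaultdict(lambda: defaultdict(int))

    def record(self, sequence):
        """Store one observed access sequence from a grid site."""
        for cur, nxt in zip(sequence, sequence[1:]):
            self.followers[cur][nxt] += 1

    def predict(self, file, k=2):
        """Top-k most frequent successors of `file` (ties alphabetical)."""
        succ = self.followers[file]
        return sorted(succ, key=lambda f: (-succ[f], f))[:k]
```

After a few recorded sessions, a request for one file lets the site pre-fetch its likely successors before they are asked for.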

20.
Research on Data Consistency for Massive Data   (Cited by: 6; self: 0; others: 6)
Replication is one of the key techniques for managing massive data, and maintaining consistency among multiple replicas is an important guarantee of the fault tolerance and performance of distributed systems. Strong consistency ensures that concurrent modifications do not conflict, but it limits system availability, connectivity, and the number of replicas; weak consistency ensures that replicas eventually converge, improving the system's fault tolerance. Starting from existing consistency-maintenance methods and the characteristics of massive data, this paper analyzes several aspects of consistency maintenance, including update publication, update propagation mode, update propagation content, and update conflict resolution, and proposes corresponding solutions.
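The abstract contrasts strong and weak consistency and mentions update propagation and conflict resolution without naming a mechanism; as one common illustration of weak (eventual) consistency, this Python sketch versions each update with a (timestamp, site) pair and resolves conflicts during propagation by last-writer-wins. The class and the tie-breaking rule are generic textbook choices, not the paper's proposal.

```python
class Replica:
    """Eventually-consistent replica: each write carries a
    (timestamp, site) version, and merging two replicas keeps the
    value with the larger version (last-writer-wins, site as
    tie-breaker), so all replicas converge after propagation."""

    def __init__(self, site):
        self.site = site
        self.data = {}  # key -> (value, (timestamp, site))

    def write(self, key, value, ts):
        self.data[key] = (value, (ts, self.site))

    def merge(self, other):
        """Pull updates from another replica, newer versions winning."""
        for key, (val, ver) in other.data.items():
            if key not in self.data or ver > self.data[key][1]:
                self.data[key] = (val, ver)
```

Two replicas that accept conflicting writes agree on the later write once each has merged the other, which is exactly the "eventual" guarantee weak consistency trades for availability.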

