Similar Documents
20 similar documents found; search took 156 ms
1.
A Genetic-Algorithm-Based Replica Management Strategy   (Cited by 2: 0 self-citations, 2 by others)
In data grid environments, data replication is widely used to improve data reliability and reduce users' data access latency. Replication immediately raises the replica selection problem: given replica performance and access characteristics, how should an application choose the best replica from a set of replicas? Targeting two key techniques of data replication, replica creation and replica selection, this paper applies a genetic algorithm to optimize replica management, building on a pricing-mechanism model and parallel data transfer. Finally, the algorithm is tested and analyzed with the grid simulator OptorSim; the results show that the genetic-algorithm-based strategy performs better.
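As a rough illustration of the idea (not the paper's actual algorithm), replica selection can be cast as a search over candidate replicas whose fitness is the estimated transfer time. The replica measurements, population sizing, and mutation rate below are all invented for the sketch:

```python
import random

# Hypothetical replica measurements: (bandwidth MB/s, round-trip latency s).
REPLICAS = [(5.0, 0.20), (12.0, 0.35), (8.0, 0.05), (3.0, 0.10)]
FILE_SIZE_MB = 100.0

def transfer_cost(idx):
    """Estimated time to fetch the file from replica `idx`."""
    bw, lat = REPLICAS[idx]
    return FILE_SIZE_MB / bw + lat

def select_best_replica(generations=30, pop_size=8, seed=0):
    """Toy GA: each individual is a replica index; fitness is negative cost.
    The initial population seeds every replica, so the optimum, once ranked
    first, survives truncation selection in every generation."""
    rng = random.Random(seed)
    pop = [i % len(REPLICAS) for i in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=transfer_cost)            # rank by cost, lower is better
        survivors = pop[: pop_size // 2]       # truncation selection
        children = [rng.randrange(len(REPLICAS)) if rng.random() < 0.3 else c
                    for c in survivors]        # mutation re-samples an index
        pop = survivors + children
    return min(pop, key=transfer_cost)

best = select_best_replica()
```

With these made-up numbers, replica 1 has the lowest estimated fetch time (100/12 + 0.35 s), so the search converges to it.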

2.
Effective data replication mechanisms and replica management strategies are important topics in information grid research. This paper outlines the basic approach of using Agent technology to implement data replication in an information grid, and discusses the design and implementation of Agent-based data replication in the national population information system. Exploiting the autonomy, mobility, and interactivity of Agents, upper layers create replicas at lower layers, and replicas are also created between peers at the same layer. A data format conversion system resolves heterogeneity among the underlying databases. The paper focuses on the implementation characteristics of Agent-based push and pull replication, and argues that Agent-based data replication is both reasonable and efficient in an information grid.

3.
Replica management has become one of the main factors affecting data grid performance, and research on efficient replica management algorithms largely depends on simulating data grid replica management. This paper describes the design and implementation of a replica management simulator for data grids, and details solutions to key simulation issues such as job scheduling and job execution simulation.

4.
Replication is a key technique for improving data access and processing efficiency in grids. To address the limitations of current replica management and the pressing problem of consistency maintenance, this paper builds on the replica facilities provided by Globus and, starting from replica creation and update, applies a log-management approach to propose a solution that combines replica creation with consistency maintenance.

5.
Grids involve the replication and transfer of large numbers of data files; effective replication saves bandwidth, reduces latency, balances load, and improves system reliability. This paper proposes an access-frequency-based replication management technique that lets users obtain the data they need efficiently, and adopts an improved Fast Upload method to update file replicas, reducing the time spent on replication and updates and improving system efficiency.
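A minimal sketch of an access-frequency-driven policy (the threshold, data structures, and trigger rule below are assumptions, not the paper's Fast Upload method):

```python
from collections import Counter

class FrequencyReplicaManager:
    """Replicate a file to a site once its local access count passes a
    threshold. The threshold value and bookkeeping are illustrative."""
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = Counter()      # (site, file) -> number of accesses
        self.replicas = {}           # site -> set of locally held files

    def access(self, site, file):
        """Record one access; return True if it triggered a replication."""
        self.counts[(site, file)] += 1
        held = self.replicas.setdefault(site, set())
        if file not in held and self.counts[(site, file)] >= self.threshold:
            held.add(file)           # "copy" the file to the local site
            return True
        return False

mgr = FrequencyReplicaManager(threshold=3)
events = [mgr.access("siteA", "f1") for _ in range(4)]   # 4 accesses to f1
```

The third access crosses the threshold and triggers the (single) replication; later accesses are served locally.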

6.
Because the global real-time ocean observing network Argo produces an enormous volume of data, this paper analyzes the format and characteristics of Argo data and proposes using grid technology to share and manage it. Key grid techniques such as metadata management, replica management, and data access and integration are applied to manage the massive Argo data, so that users can transparently access and use Argo ocean observation data anywhere, at any time, making it easier for ocean science research to obtain marine environmental data.

7.
沈薇  刘方爱 《微机发展》2006,16(11):185-187
Grids involve the replication and transfer of large numbers of data files; effective replication saves bandwidth, reduces latency, balances load, and improves system reliability. This paper proposes an access-frequency-based replication management technique that lets users obtain the data they need efficiently, and adopts an improved Fast Upload method to update file replicas, reducing the time spent on replication and updates and improving system efficiency.

8.
A Dynamic, Self-Adaptive Replica Location Method for Data Grid Environments   (Cited by 10: 2 self-citations, 10 by others)
In a data grid, data are often replicated for performance and availability, and efficiently locating the physical positions of one or more replicas of a data item is an important problem a data grid system must solve. This paper proposes DSRL, a scalable, dynamically self-adaptive distributed replica location method. DSRL uses home nodes to support simultaneous, efficient location of multiple replicas of the same data, and local replica location nodes to support local replica queries. DSRL introduces a dynamic balanced mapping method that distributes global replica location information evenly across multiple home nodes and adapts to home nodes joining or leaving. The paper describes DSRL's components in detail and proves its correctness and load-balancing properties. Analysis and experiments show that DSRL offers good scalability, reliability, adaptivity, and performance, and that it is simple to implement and quite practical.
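The balanced mapping DSRL describes is reminiscent of consistent hashing: the sketch below maps file identifiers to home nodes on a hash ring, so a joining node disturbs only part of the mapping. The node names, hash choice, and ring mechanics are illustrative assumptions, not DSRL's actual construction:

```python
import hashlib
from bisect import bisect_right

class HomeNodeRing:
    """Hash-ring mapping from file identifiers to "home" nodes."""
    def __init__(self, nodes):
        self.ring = sorted((self._h(n), n) for n in nodes)

    @staticmethod
    def _h(key):
        # Stable hash so the mapping is deterministic across runs.
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def home_of(self, file_id):
        # The home node is the first node clockwise from the file's hash.
        keys = [h for h, _ in self.ring]
        i = bisect_right(keys, self._h(file_id)) % len(self.ring)
        return self.ring[i][1]

    def join(self, node):
        # A joining node takes over only the arc of the ring it lands on.
        self.ring = sorted(self.ring + [(self._h(node), node)])

ring = HomeNodeRing(["node1", "node2", "node3"])
before = {f: ring.home_of(f) for f in ["fileA", "fileB", "fileC", "fileD"]}
ring.join("node4")
after = {f: ring.home_of(f) for f in before}
moved = sum(before[f] != after[f] for f in before)
```

Only the files whose hashes fall on node4's arc change home node; the rest of the global location information stays where it was.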

9.
A Grid-Based Multimedia Content Distribution Framework   (Cited by 1: 0 self-citations, 1 by others)
Large-scale Internet-based multimedia content distribution systems make extensive use of data replication to improve performance, and grid technology holds great promise for building large-scale distributed information systems. This paper proposes a grid-based multimedia content distribution framework in which replica management, transfer, and lookup mechanisms suitable for streaming media applications are implemented on top of basic grid services. The paper also examines key techniques for wide-area distributed information systems, including replica placement algorithms and user request scheduling mechanisms.

10.
杨涛  刘贵全 《计算机仿真》2007,24(2):126-129
A data grid is a data management and storage architecture for grid environments that usually uses data replication to obtain better data access efficiency and fault tolerance. This paper proposes a replication management model based on a multi-agent system (MAS) to address the management difficulties caused by the high autonomy and dynamism of data in a data grid. It discusses a MAS-based implementation architecture, presents the Agent structure and cooperation process, and encapsulates replication management and replication optimization strategies in Agent intelligence modules. The model and replication optimization strategies are analyzed against a real application with the OptorSim simulator, and the economic model's binomial-distribution-based valuation function is improved. Simulation results show that the model provides efficient replication management services.

11.
Data replication and consistency refer to the same data being stored at distributed sites and kept consistent when one or more copies are modified. A good file maintenance and consistency strategy can reduce file access times and access latencies and increase download speeds, thus reducing overall computing times. In this paper, we propose dynamic services for replicating and maintaining data in grid environments and for directing replicas to appropriate locations for use. To address a problem with the Bandwidth Hierarchy-based Replication (BHR) algorithm, a strategy for maintaining replicas dynamically, we propose the Dynamic Maintenance Service (DMS). We also propose a One-way Replica Consistency Service (ORCS) for data grid environments, a proactive approach to consistency maintenance that we hope strikes a balance between data access performance and replica consistency. Experimental results show that our services are more efficient than other strategies.

12.
The Data Grid provides massive aggregated computing resources and distributed storage space to deal with data-intensive applications. Given the limited resources available in the grid and the large volumes of data produced, efficient use of Grid resources becomes an important challenge. Data replication is a key optimization technique for reducing access latency and managing large data by storing data in a wise manner. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. In this paper two strategies are proposed. First, a novel job scheduling strategy called Weighted Scheduling Strategy (WSS) uses hierarchical scheduling to reduce the search time for an appropriate computing node; it considers the number of jobs waiting in a queue, the location of the data required by the job, and the computing capacity of the sites. Second, a dynamic data replication strategy called Enhanced Dynamic Hierarchical Replication (EDHR) improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication strategy. It uses an economic model, based on the future value of a data file, to decide which files to delete when there is not enough space for a new replica. Good replica placement plays an important role in obtaining maximum benefit from replication as well as in reducing storage cost and mean job execution time, so it is also considered in this paper. The proposed strategies are implemented in OptorSim, the European Data Grid simulator. Experimental results show that the proposed strategies achieve better performance by minimizing data access time and avoiding unnecessary replication.
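EDHR's economic model values files by their expected future usefulness and evicts the least valuable when space runs short. The valuation formula and figures below are stand-ins invented for illustration, not the paper's model:

```python
def future_value(access_times, now, horizon=10.0):
    # Invented valuation: each past access contributes less the older it is.
    return sum(1.0 / (1.0 + (now - t) / horizon) for t in access_times)

def make_room(storage, history, needed, capacity, now):
    """Evict the lowest-valued files until `needed` MB of free space exists."""
    used = sum(storage.values())
    victims = sorted(storage, key=lambda f: future_value(history.get(f, []), now))
    deleted = []
    for f in victims:
        if capacity - used >= needed:
            break                        # enough room for the new replica
        used -= storage.pop(f)           # evict the least valuable file
        deleted.append(f)
    return deleted

storage = {"a": 40, "b": 30, "c": 20}                  # file -> size in MB
history = {"a": [1, 2, 9], "b": [1], "c": [8, 9]}      # file -> access times
deleted = make_room(storage, history, needed=25, capacity=100, now=10.0)
```

File "b" has a single old access, so it is valued lowest and evicted first; evicting it alone frees enough room, so the recently used files survive.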

13.
In recent years data grids have been deployed and have grown in many scientific experiments and data centers. The deployment of such environments has given grid users access to large amounts of distributed data. Data replication is a key issue in a data grid and should be applied intelligently, because it reduces data access time and bandwidth consumption for each grid site; the area is therefore challenging and offers much scope for improvement. In this paper, we introduce a new dynamic data replication algorithm named Popular File Group Replication (PFGR), which is based on three assumptions: first, users in a grid site (Virtual Organization) have similar interests in files; second, file accesses exhibit temporal locality; and third, all files are read-only. Based on the file access history and the first assumption, PFGR builds a connectivity graph for a group of dependent files in each grid site and replicates the most popular group of files to the requesting grid site. After that, when a user of that grid site needs some of those files, they are available locally. The simulation results show that our algorithm increases performance by minimizing mean job execution time and bandwidth consumption, and avoids unnecessary replication.
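A simplified reading of the PFGR idea: treat files that appear in the same job as connected, then replicate the connected group with the highest total access count. The graph construction and scoring below are assumptions made for this sketch:

```python
from collections import defaultdict
from itertools import combinations

def popular_file_group(job_histories):
    """Return the connected file group with the most total accesses."""
    edges = defaultdict(set)
    counts = defaultdict(int)
    for files in job_histories:
        for f in files:
            counts[f] += 1                     # per-file popularity
        for a, b in combinations(sorted(files), 2):
            edges[a].add(b)                    # co-accessed files are linked
            edges[b].add(a)
    # Find connected components and score each by total access count.
    seen, best = set(), []
    for start in list(counts):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            f = stack.pop()
            if f in comp:
                continue
            comp.add(f)
            stack.extend(edges[f] - comp)
        seen |= comp
        if sum(counts[f] for f in comp) > sum(counts[f] for f in best):
            best = comp
    return best

jobs = [{"f1", "f2"}, {"f1", "f2", "f3"}, {"f4"}, {"f1"}]
group = popular_file_group(jobs)
```

Here f1, f2, and f3 form one component with six total accesses, beating the isolated f4, so the whole group would be replicated to the requesting site.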

14.
Data Mining on the Grid   (Cited by 24: 2 self-citations, 22 by others)
Grids are a hot topic in network computing, distributed computing, and high-performance computing research. As data in scientific computing grows explosively and wide-area sharing of massive data in future grid computing environments becomes reality, data mining will play an important role in extracting useful information and discovering new knowledge and patterns. Considering the characteristics of grids, this paper surveys the features and key techniques of grid data mining, focuses on its architecture and basic process, and concludes with an example of OGSA-based grid data mining.

15.
Data Grid is a geographically distributed environment that deals with large-scale data-intensive applications. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Data replication is another key optimization technique for reducing access latency and managing large data by storing data in a wise manner. In this paper two algorithms are proposed. First, a novel job scheduling algorithm called Combined Scheduling Strategy (CSS) uses hierarchical scheduling to reduce the search time for an appropriate computing node; it considers the number of jobs waiting in the queue, the location of the data required by the job, and the computing capacity of the sites. Second, a dynamic data replication strategy called the Modified Dynamic Hierarchical Replication Algorithm (MDHRA) improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication (DHR) strategy. Data replication should be used wisely because the storage capacity of each Grid site is limited; it is therefore important to design an effective replica replacement strategy. MDHRA replaces replicas based on the last time the replica was requested, the number of accesses, and the size of the replica. It selects the best replica location from among the many replicas based on response time, which is determined by the data transfer time, the storage access latency, the replica requests waiting in the storage queue, and the distance between nodes. The simulation results demonstrate that the proposed replication and scheduling strategies give better performance than the other algorithms.
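MDHRA's selection rule picks the replica with the lowest estimated response time. A toy version follows, with invented site figures and with the node-distance term omitted for brevity:

```python
def estimated_response(size_mb, bw_mbps, storage_latency_s, queue_len, per_req_s):
    # Transfer time + storage access latency + wait behind queued requests.
    return size_mb / bw_mbps + storage_latency_s + queue_len * per_req_s

def select_replica(replicas, size_mb):
    """Pick the replica location with the lowest estimated response time."""
    return min(replicas, key=lambda r: estimated_response(
        size_mb, r["bw_mbps"], r["latency_s"], r["queue"], r["per_req_s"]))

replicas = [
    {"site": "A", "bw_mbps": 10, "latency_s": 0.2, "queue": 5, "per_req_s": 0.5},
    {"site": "B", "bw_mbps": 8,  "latency_s": 0.1, "queue": 0, "per_req_s": 0.5},
]
best = select_replica(replicas, size_mb=100)
```

Site A has the faster link (10 s transfer vs 12.5 s), but its queued requests add 2.5 s of wait, so the idle site B wins (12.6 s vs 12.7 s): the selection trades raw bandwidth against storage-queue congestion.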

16.
Replica Placement Strategies in Data Grid   (Cited by 1: 0 self-citations, 1 by others)
Replication is a technique used in Data Grid environments that helps to reduce access latency and network bandwidth utilization. Replication also increases data availability, thereby enhancing system reliability. This research addresses the problem of replication in Data Grid environments by investigating a set of highly decentralized dynamic replica placement algorithms. The replica placement algorithms are based on heuristics that consider both network latency and user requests to select the best candidate sites at which to place replicas. Because of the dynamic nature of the Grid, the sites that currently hold replicas may not be the best sites from which to fetch them in subsequent periods. Therefore, a replica maintenance algorithm is proposed that relocates replicas to different sites if the performance metric degrades significantly. The study of our replica placement algorithms is carried out using a model of the EU Data Grid Testbed 1 [Bell et al. Comput. Appl., 17(4), 2003] sites and their associated network geometry. We validate our replica placement algorithms with total file transfer times, the number of local file accesses, and the number of remote file accesses.
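A heuristic of this flavor can be sketched by scoring each candidate site on request volume discounted by network latency; the scoring function and site figures are illustrative assumptions, not the paper's heuristic:

```python
def place_replica(sites):
    """Score each candidate site: request volume discounted by latency."""
    return max(sites, key=lambda s: s["requests"] / (1.0 + s["latency_ms"]))

sites = [
    {"name": "s1", "requests": 120, "latency_ms": 40},
    {"name": "s2", "requests": 200, "latency_ms": 90},
    {"name": "s3", "requests": 90,  "latency_ms": 10},
]
chosen = place_replica(sites)
```

The nearby site s3 wins despite having the fewest requests, because its low latency dominates the score; a maintenance pass would simply rerun this scoring periodically and relocate the replica when the winner changes.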

17.
Replica management is a key issue in reducing bandwidth consumption, improving data availability, and maintaining data consistency in large distributed systems. Global Replica Management (GRM) maintains data consistency across the entire network and is preferable particularly for multi-group distributed systems. On the other hand, GRM is unattractive for many applications because replica management requires a very large number of message passes. In this paper, to reduce the number of message passes needed for an efficient GRM strategy, an interconnection structure called the Distributed Spanning Tree (DST) is employed. The DST organizes the peer network into logical layered structures and thereby provides a hierarchical mechanism for replica management. It is proved that this hierarchical approach improves data availability and consistency across the entire network, and that the proposed approach reduces data latency and the number of message passes required for any specific application in the network.

18.
Data grids support access to widely distributed storage for large numbers of users accessing potentially many large files. Efficient access is hindered by the high latency of the Internet. To improve access time, replication at nearby sites may be used. Replication also provides high availability, decreased bandwidth use, enhanced fault tolerance, and improved scalability. Resource availability, network latency, and user requests in a grid environment may vary with time. Any replica placement strategy must be able to adapt to such dynamic behavior. In this paper, we describe a new dynamic replica placement algorithm, Popularity Based Replica Placement (PBRP), for hierarchical data grids which is guided by file "popularity". Our goal is to place replicas close to clients to reduce data access time while still using network and storage resources efficiently. The effectiveness of PBRP depends on the selection of a threshold value related to file popularity. We also present Adaptive-PBRP (APBRP) that determines this threshold dynamically based on data request arrival rates. We evaluate both algorithms using simulation. Results for a range of data access patterns show that our algorithms can shorten job execution time significantly and reduce bandwidth consumption compared to other dynamic replication methods.
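APBRP's key move is tying the replication threshold to the request arrival rate. The linear threshold rule below is a made-up stand-in that merely shows the mechanism (hotter periods lower the threshold, so more files qualify for replication):

```python
def files_to_replicate(request_counts, arrival_rate, base=5, scale=0.1):
    """Return (files worth replicating, threshold). The threshold drops
    linearly as the request arrival rate rises (illustrative rule only)."""
    thr = max(1, round(base - scale * arrival_rate))
    return sorted(f for f, c in request_counts.items() if c >= thr), thr

counts = {"f1": 6, "f2": 3, "f3": 1}                 # file -> request count
quiet, thr_q = files_to_replicate(counts, arrival_rate=0.0)   # calm period
busy, thr_b = files_to_replicate(counts, arrival_rate=30.0)   # hot period
```

In the calm period only the clearly popular f1 crosses the threshold; under heavy arrival rates the threshold drops and f2 is replicated as well, which is the adaptive behavior APBRP aims for.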

19.
Research on Data Consistency for Massive Data   (Cited by 6: 0 self-citations, 6 by others)
Replication is one of the key techniques for managing massive data, and maintaining consistency among multiple replicas is an important guarantee of a distributed system's fault tolerance and performance. Strong consistency ensures that concurrent updates never conflict, but it limits system availability, connectivity, and the number of replicas; weak consistency ensures only that replicas eventually converge, which improves fault tolerance. Starting from existing consistency maintenance methods and the characteristics of massive data, this paper analyzes update publication, update propagation modes, update propagation content, and update conflict resolution, and proposes corresponding solutions.

20.
Data replication techniques are used in data grid to reduce makespan, storage consumption, access latency and network bandwidth. Data replication enhances data availability and thereby increases the system reliability. There are two steps involved in data replication, namely, replica placement and replica selection. Replica placement involves identifying the best possible node to duplicate data based on network latency and user request. Replica selection involves selecting the best replica location to access the data for job execution in the data grid. Various replica placement and selection algorithms are available in the literature. These algorithms measure and analyze different parameters such as bandwidth consumption, access cost, scalability, execution time, storage consumption and makespan. In this paper, various replica placement and selection strategies along with their merits and demerits are discussed. This paper also analyses the performance of various strategies with respect to the parameters mentioned above. In particular, this paper focuses on the dynamic replica placement and selection strategies in the data grid environment.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号