Similar literature
Found 20 similar articles (search time: 46 ms)
1.
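Data replica management is an important part of cloud computing system management. For massive data processing in cloud computing systems, existing data placement and resource scheduling algorithms give too little consideration to replica dynamics and reliability; to address this, a dynamic replica placement mechanism is proposed. The mechanism is based on a region structure and considers the number and placement of replicas during data processing, as well as the overhead that replica creation imposes on system resources such as memory and bandwidth. First, using the replica information in cloud storage, data that is accessed frequently and has a long average access response time is replicated, and a method for calculating the number of replicas is given. Then, to narrow the range of candidate nodes over which replicas are distributed, a dynamic replica placement algorithm, DRA, is proposed, which filters the nodes within a certain range according to the proposed domain partitioning in order to store the replicas. Experimental results show that the proposed mechanism not only reduces the storage space wasted by replicas with low access rates but also reduces the cross-node transmission delay for replicas with high access rates, effectively improving data file access efficiency, load balance, and the reliability and availability of the cloud storage system.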

2.
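With the goal of minimizing total system response time, and using file heat as the basis, a multi-time-window load balancing strategy is proposed. File heat is computed not only from the number and size of accesses but also from the I/O access timing, which effectively curbs unnecessary replica creation caused by short bursts of data access. The strategy sets up three time windows of different sizes, which respectively handle load balancing across storage nodes, load balancing of file replicas, and deletion of surplus replicas of low-heat files. Experimental data show that the multi-time-window load balancing strategy significantly reduces I/O access response time.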
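One way to fold access timing into a heat statistic, as the abstract above describes, is to weight each access by its size and let that weight decay with age. The sketch below assumes an exponential decay controlled by a hypothetical `half_life` parameter; the abstract does not give the paper's actual formula, so `file_heat` and its arguments are illustrative only.

```python
import math


def file_heat(accesses, now, half_life=3600.0):
    """Time-decayed heat of one file (illustrative, not the paper's formula).

    accesses  -- iterable of (timestamp_seconds, bytes_accessed) pairs
    now       -- current time in seconds, same clock as the timestamps
    half_life -- assumed parameter: seconds after which an access counts half
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    decay = math.log(2) / half_life
    heat = 0.0
    for timestamp, size in accesses:
        if timestamp > now:
            continue  # ignore accesses recorded in the future
        # Larger accesses add more heat; older accesses add less.
        heat += size * math.exp(-decay * (now - timestamp))
    return heat
```

Under this assumption a short burst raises heat only briefly, so a replica-creation threshold evaluated over a longer window would be less likely to fire on it.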

3.
Effective data management is an important issue for a large-scale distributed environment such as a data cloud. This can be achieved by using file replication, which efficiently reduces file service time and access latency, increases file availability and improves system load balancing. However, replication entails various costs such as storage and energy consumption for holding replicas. This article proposes a multi-objective offline optimization approach for replica management, in which we treat the factors influencing replication decisions (mean file unavailability, mean service time, load variance, energy consumption and mean access latency) as five objectives. It decides the replication factor and replication layout with an improved artificial immune algorithm that evolves a set of solution candidates through clone, mutation and selection processes. The proposed algorithm, named Multi-objective Optimized Replication Management (MORM), seeks near-optimal solutions by balancing the trade-offs among the five optimization objectives. The article reports a series of experiments that show the effectiveness of MORM. Experimental results demonstrate that MORM is energy effective and outperforms the default replication management of HDFS (Hadoop Distributed File System) and the MOE (Multi-objective Evolutionary) algorithm in terms of performance and load balancing for large-scale cloud storage clusters.

4.
Cloud computing is attracting growing interest as a new trend in data management. Data replication has been widely applied to improve data access in distributed systems such as Grid and Cloud. However, due to the finite storage capacity of each site, copies that are useful for future jobs can be wastefully deleted and replaced with less valuable ones. Therefore, it is important to have an appropriate replication strategy that can dynamically store the replicas while satisfying quality of service (QoS) requirements and storage capacity constraints. In this paper, we present a dynamic replication algorithm, named hierarchical data replication strategy (HDRS). HDRS consists of replica creation, which adaptively increases replicas based on an exponential growth or decay rate; replica placement, according to the access load and a labeling technique; and finally replica replacement, based on the future value of a file. We evaluate different dynamic data replication methods using CloudSim simulation. Experiments demonstrate that HDRS can reduce response time and bandwidth usage compared with other algorithms. This means that HDRS can identify a popular file and replicate it to the best site. The method avoids useless replications and decreases access latency by balancing the load of sites.

5.
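黄冬梅, 杜艳玲, 贺琪, 随宏运, 李瑶. 《计算机科学》 (Computer Science), 2018, 45(6): 72-75, 104
Data integrity and reliability are key to ensuring that data can be accessed efficiently; in cloud storage environments in particular, the data replica strategy is central to system performance and data availability. From the perspective of replica layout, a Data Replica Layout Strategy based on Multiple Attribute Optimization (MAO-DRLS) is proposed. Based on the access heat of the data and the key attributes of storage nodes, the strategy sets a dynamic replica count for each data item and selects suitable nodes on which to lay out the replicas. Experiments show that MAO-DRLS effectively improves replica utilization and shortens system response time.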

6.
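Cloud storage has become a foundational technology for shared storage and data services on the Internet. Cloud storage systems commonly use data replication to improve data availability, strengthen fault tolerance, and improve system performance. A cluster-based data replication strategy for cloud storage systems is proposed, covering when to trigger replication, how many replicas to create, and where to place the resulting replicas. For replica placement, a cluster-based load-balanced placement method is designed. Simulation experiments show that the proposed method is feasible and performs well.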

7.
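To manage data replicas effectively in cloud computing environments, reduce bandwidth consumption, minimize response time, and balance load, a dynamic replica management strategy is proposed. The minimum number of replicas the system should maintain is computed from a model relating file availability to replica count; replicas are placed according to the number of access requests for the data and the transfer cost; and when data is requested, a replica is selected by combining inter-node bandwidth and node utility. Experimental results demonstrate the correctness and effectiveness of the strategy.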
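The abstract above does not spell out its availability model. A common textbook assumption is that each replica is independently available with probability p, so a file with r replicas is available with probability 1 - (1 - p)^r. Under that assumption (which may differ from the paper's model), the minimum replica count for a target availability can be sketched as follows; `min_replicas` and its parameter names are hypothetical.

```python
def min_replicas(node_availability, target_availability):
    """Smallest r such that 1 - (1 - p)**r >= target.

    Assumes replicas fail independently with the same per-node
    availability p; this is an illustrative model, not the paper's.
    """
    if not 0.0 < node_availability < 1.0:
        raise ValueError("node_availability must be strictly between 0 and 1")
    if not 0.0 <= target_availability < 1.0:
        raise ValueError("target_availability must be in [0, 1)")
    replicas = 1
    # Each extra replica multiplies the unavailability by (1 - p),
    # so the loop terminates for any target below 1.
    while 1.0 - (1.0 - node_availability) ** replicas < target_availability:
        replicas += 1
    return replicas
```

For example, with nodes that are up half the time, four replicas are needed before file availability exceeds 0.9, since three replicas give only 0.875.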

8.
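The data replica management mechanism is an important part of a cloud storage system. To improve scalability and reliability while also improving user access time, multiple data replicas are usually used, which raises the replica placement problem. To this end, an intelligent multi-replica placement mechanism for cloud storage systems is proposed. Based on the p-center model, and with minimizing access cost as the optimization objective, the mechanism uses a genetic algorithm (GA) to determine an optimized replica placement scheme and a biogeography-based optimization (BBO) algorithm to determine an optimized assignment of user access requests to replicas. The mechanism was implemented in simulation and evaluated on CloudSim; the results show that it is feasible and effective.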

9.
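A replica selection strategy for data grids based on the ant algorithm   Total citations: 3, self-citations: 0, citations by others: 3
In a data grid environment, where large amounts of data and computing capacity are distributed, data replicas are an important way to improve the availability of grid applications. How to choose optimally among the many replicas in a data grid is an important factor in data grid performance. A replica selection strategy based on the ant algorithm is therefore proposed, and the algorithm is implemented and its performance analyzed in the grid simulator OptorSim. Simulation results show that the algorithm reduces data access latency and bandwidth consumption and effectively balances load among the storage nodes of the grid.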

10.
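In distributed storage systems in cloud computing environments, replication is commonly used to ensure the availability and reliability of the storage system, and the placement strategy is a key issue in replication. To address the high replica access overhead of existing placement strategies, a replica placement algorithm based on discrete firefly optimization is proposed. Taking into account the effect of replica placement on user access performance, a mathematical model is built, the fitness function of each firefly position is computed, and fireflies move toward the position with the highest luciferin value, i.e., the optimum, which yields suitable replica placement nodes. The method is evaluated through simulation and compared with a replica placement strategy based on the ant colony algorithm. The results show that the algorithm selects suitable placement nodes, converges well, and effectively reduces the replica access overhead of the storage system.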

11.
In this paper, we propose a simulation model to study real-world replication workflows for cloud storage systems. With this model, we present three new methods to maximize storage space usage during replica creation, and two novel QoS-aware greedy algorithms for replica placement optimization. Our algorithms are evaluated by simulation and compared with existing placement algorithms, showing that (i) a more even distribution of replicas for a data set can be achieved by using round-robin methods in the replica creation phase and (ii) the two proposed greedy algorithms, named GS_QoS and GS_QoS_C1, not only yield more economical results than those from Chen et al., but also guarantee the QoS for clients. Copyright © 2016 John Wiley & Sons, Ltd.

12.
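Data replica creation strategies in grid environments   Total citations: 6, self-citations: 0, citations by others: 6
This paper discusses data replica creation strategies in grid environments, including intra-domain replica derivation and inter-domain replica expansion strategies. The strategies choose appropriate times and locations to create replicas, improving user access speed and bandwidth consumption while making full use of storage resources.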

13.
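A replica management and update method for data grids with individual QoS constraints   Total citations: 1, self-citations: 0, citations by others: 1
Data grid systems usually use replication to improve overall system performance, and traditional replica placement techniques determine the number and deployment of replicas from aggregate QoS requirements. For a class of data grid applications with strict QoS requirements, an individual-QoS-constrained data grid model, IQDG, is established, and a heuristic individual-QoS-constrained replica placement algorithm, qGREP, and a consistency maintenance method based on a logical ring structure are proposed. The heuristic information used by IQDG jointly considers satisfying individual QoS constraints and controlling replica overhead, and yields reasonable replica strategies. Theoretical analysis establishes the correctness and convergence of the algorithm, and simulation results show that it effectively solves the individual-QoS-constrained replica placement problem and achieves good access performance under a variety of network topologies, access patterns, and load conditions.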

14.
Data grids support access to widely distributed storage for large numbers of users accessing potentially many large files. Efficient access is hindered by the high latency of the Internet. To improve access time, replication at nearby sites may be used. Replication also provides high availability, decreased bandwidth use, enhanced fault tolerance, and improved scalability. Resource availability, network latency, and user requests in a grid environment may vary with time. Any replica placement strategy must be able to adapt to such dynamic behavior. In this paper, we describe a new dynamic replica placement algorithm, Popularity Based Replica Placement (PBRP), for hierarchical data grids which is guided by file “popularity”. Our goal is to place replicas close to clients to reduce data access time while still using network and storage resources efficiently. The effectiveness of PBRP depends on the selection of a threshold value related to file popularity. We also present Adaptive-PBRP (APBRP) that determines this threshold dynamically based on data request arrival rates. We evaluate both algorithms using simulation. Results for a range of data access patterns show that our algorithms can shorten job execution time significantly and reduce bandwidth consumption compared to other dynamic replication methods.

15.
Data Grid is a geographically distributed environment that deals with large-scale data-intensive applications. Effective scheduling in a Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Data replication is another key optimization technique for reducing access latency and managing large data by storing data wisely. In this paper two algorithms are proposed. The first is a novel job scheduling algorithm called Combined Scheduling Strategy (CSS) that uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers the number of jobs waiting in the queue, the location of the data required by the job and the computing capacity of sites. The second is a dynamic data replication strategy, called the Modified Dynamic Hierarchical Replication Algorithm (MDHRA), that improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication (DHR) strategy. Data replication should be used wisely because the storage capacity of each Grid site is limited. Thus, it is important to design an effective strategy for replica replacement. MDHRA replaces replicas based on the last time the replica was requested, the number of accesses, and the size of the replica. It selects the best replica location from among the many replicas based on response time, which is determined by considering the data transfer time, the storage access latency, the replica requests waiting in the storage queue and the distance between nodes. The simulation results demonstrate that the proposed replication and scheduling strategies give better performance than the other algorithms.

16.
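A replica placement strategy for time-shifted TV systems   Total citations: 1, self-citations: 0, citations by others: 1
To address replica placement in time-shifted TV systems, replicas of streamed data segments are placed in a planned way in the cache space of routing nodes, improving user experience and reducing startup delay and channel-switching response delay. The work studies load balancing and response delay minimization in replica placement, summarizes the relationship between node connectivity degree and replica placement, and, based on an analysis of the characteristics of Internet TV data, proposes a hybrid replica placement strategy. Simulation results show that the strategy effectively raises the success rate of query message searches while occupying little cache space.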

17.
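王鑫, 孟雨, 覃琴, 蒋华. 《计算机应用研究》 (Application Research of Computers), 2020, 37(4): 1111-1114
To improve the efficiency of data scheduling and replica access in cloud computing, the replica placement problem within replica strategies is studied, and a replica placement strategy based on the ant colony algorithm is proposed. Following the principle of ant colony foraging in nature, the ant colony algorithm is applied throughout the replica placement process; an ant colony algorithm improved with dynamic pheromone updating and the Laplace probability distribution yields a set of optimal solutions for replica placement. Simulations on the CloudSim platform show that the proposed scheme outperforms the original ant colony algorithm in average job completion time, network utilization, and load balance, and reduces the time cost of replica placement and the network load to some extent.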

18.
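A grid scheduling method based on a hierarchical scheduling strategy and dynamic data replication   Total citations: 2, self-citations: 0, citations by others: 2
To address how to perform task scheduling and data replication effectively in grids so as to reduce task execution time, a task scheduling algorithm (ISS) and an optimized dynamic data replication algorithm (ODHRA) are proposed, and a scheme is built that combines the two. The scheme uses ISS to jointly consider the number of tasks in the waiting queue, the location of the data each task requires, and the computing capacity of sites; it schedules hierarchically over the network structure and computes a combined task cost with appropriate weight coefficients to find the best region of computing nodes. It uses ODHRA to analyze data transfer time, storage access latency, replica requests waiting in the storage queue, and the distance between nodes in order to select the best replica location among the many replicas, and combines this with replica placement and replica management to reduce file access time. Simulation results show that the proposed scheme achieves better average task execution time than other algorithms.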

19.
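The distributed file system HDFS uses a rack-aware replica placement policy that guarantees data reliability to a certain extent, but after the system has run for some time the data distribution becomes unbalanced. Although the Balancer program can redistribute data, handling storage imbalance only after the fact hurts the system's data read rate and reliability. A replica placement strategy based on multi-layer consistent hashing is adopted instead: a consistent hashing algorithm first determines the rack for a data replica, and a second consistent hashing pass determines the data node within that rack, which becomes the storage location. During lookup, the consistent hashing algorithm uses equal partitioning of the address space and virtual nodes, which improves lookup efficiency and distribution balance. Compared with the original policy, the strategy substantially improves balanced data storage and upload rate, and it adapts to the data.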
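The two-stage lookup described above (rack first, then data node within the rack) can be sketched with two layers of consistent-hash rings. This is a minimal illustration under stated assumptions, not the paper's implementation: `HashRing`, `place_replica`, the MD5-based `stable_hash`, the `vnodes` count, and the rack and node names are all hypothetical; the equal address partitioning mentioned in the abstract is omitted; and nothing here forces different replicas of a block onto different racks.

```python
import bisect
import hashlib


def stable_hash(key):
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative)."""

    def __init__(self, members, vnodes=64):
        members = list(members)
        if not members:
            raise ValueError("ring needs at least one member")
        # Each member owns `vnodes` points to even out the distribution.
        points = sorted(
            (stable_hash(f"{m}#{i}"), m) for m in members for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in points]
        self._owners = [owner for _, owner in points]

    def lookup(self, key):
        """Return the member owning the first point clockwise of the key."""
        idx = bisect.bisect_right(self._positions, stable_hash(key))
        return self._owners[idx % len(self._owners)]  # wrap around the ring


def place_replica(block_id, replica_index, rack_ring, node_rings):
    """Layer 1 picks a rack; layer 2 picks a data node inside that rack."""
    key = f"{block_id}/{replica_index}"
    rack = rack_ring.lookup(key)
    node = node_rings[rack].lookup(key)
    return rack, node


# Hypothetical two-rack topology for illustration.
topology = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
rack_ring = HashRing(topology)
node_rings = {rack: HashRing(nodes) for rack, nodes in topology.items()}
```

Because placement is a pure function of the key and the ring membership, any client can recompute a replica's location without consulting a central table, and adding a node moves only the keys adjacent to its virtual points.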

20.
Data Grid provides a scalable infrastructure for storage resource and data file management, which supports several large-scale applications. Because available resources in a grid are limited, using them efficiently becomes an important challenge. Replication is a technique used in data grids to improve fault tolerance and to reduce bandwidth consumption. This paper proposes a Dynamic Hierarchical Replication (DHR) algorithm that places replicas in appropriate sites, i.e. the best site, which has the highest number of accesses for that particular replica. It also minimizes access latency by selecting the best replica when various sites hold replicas. The proposed replica selection strategy selects the best replica location for the users' running jobs by considering the replica requests waiting in the storage and the data transfer time. Simulation results with OptorSim, the European Data Grid simulator, show that the DHR strategy gives better performance than the other algorithms and prevents unnecessary creation of replicas, which leads to efficient storage usage.
