Similar Literature
 20 similar documents found (search time: 218 ms)
1.
王巍  司加全  玄世昌  杨武 《计算机工程》2012,38(11):111-113
A load-balancing method for access hotspots in P2P networks based on query-request analysis is proposed. The time series of historical query data is analyzed, and single exponential smoothing is used to predict future hotspots. According to the query-request behavior of the hotspots, a strategy combining end-to-end replication with replication to lightly loaded neighbor nodes is adopted to balance the load, eliminating the load imbalance caused by access hotspots. Experimental results verify the effectiveness of the method.
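As a rough sketch of the single exponential smoothing step described above (the abstract gives no parameters; the smoothing factor and hotspot threshold below are assumptions):

```python
def smooth_forecast(history, alpha=0.3):
    """Single exponential smoothing of a per-key query-count time series; the
    last smoothed value serves as the forecast for the next period. alpha is
    illustrative, not the paper's calibrated value."""
    if not history:
        return 0.0
    s = float(history[0])
    for x in history[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

# keys whose forecast exceeds a (hypothetical) threshold are treated as upcoming hotspots:
# hotspots = {k for k, h in per_key_history.items() if smooth_forecast(h) > hot_threshold}
```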

2.
A load-balancing method for hierarchical structured P2P networks
Compared with flat structured P2P networks, hierarchical structured P2P networks can exploit stable, high-performance super nodes to improve performance in dynamic environments. However, load imbalance among super nodes is one of the fundamental problems facing hierarchical structured P2P networks. To address this, the authors propose a load-balancing method for super nodes: the key space and the leaf-node space that each super node is responsible for are decoupled to create the conditions for balancing, and a "moment balance" principle is applied so that both the leaf-node space and the query-request load assigned to each super node are balanced. Experimental results show that, when node capacities follow a Zipf distribution and lookup requests follow a normal or Pareto distribution, the method achieves good balance among super nodes and allows a smaller number of super nodes to carry a larger total load.

3.
To manage data replicas effectively in cloud computing environments, reduce system bandwidth consumption, minimize response time and balance load, a dynamic replica management strategy is proposed. A model of the relationship between file availability and the number of replicas is built to compute the minimum number of replicas the system should maintain; replica placement is based on the number of access requests for the data and the transfer cost; when data are requested, the replica is selected by combining inter-node bandwidth and node utility. Experimental results demonstrate the correctness and effectiveness of the strategy.
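A minimal sketch of how an availability/replica-count relationship model can yield the minimum replica count, assuming independent node failures with a uniform per-node availability (the paper's actual model is not given in the abstract):

```python
import math

def min_replica_count(target_availability: float, node_availability: float) -> int:
    """Smallest n with 1 - (1 - p)^n >= A, assuming independent node failures."""
    if not 0.0 < node_availability < 1.0:
        raise ValueError("node availability must be in (0, 1)")
    n = math.log(1.0 - target_availability) / math.log(1.0 - node_availability)
    return max(1, math.ceil(n))

# e.g. nodes up 90% of the time, target file availability 99.9% -> 3 replicas
print(min_replica_count(0.999, 0.9))
```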

4.
In a cloud storage center, the loss of file block replicas caused by node failures not only affects system reliability but also degrades the efficiency of concurrent file access. To address the problems of Hadoop's default re-replication method, namely that data transfer concentrates on certain nodes during re-replication, the load is unbalanced and disk I/O throughput is low, a fast, heat-based (popularity-based) replica re-replication algorithm is proposed. The algorithm re-replicates hot data blocks first and chooses the source and destination nodes of each block copy sensibly. Simulation results show that the algorithm balances the system workload, improves disk I/O throughput, and significantly reduces the average response time of user requests.
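A hedged sketch of heat-first re-replication scheduling in the spirit of this algorithm; the record fields (`heat`, `holders`) and the load accounting are assumptions, not Hadoop's actual data structures:

```python
import heapq

def schedule_rereplication(lost_blocks, node_load):
    """lost_blocks: list of {"id", "heat", "holders"}; node_load: {node: load}.
    Re-replicate the hottest blocks first, reading from the least-loaded
    surviving holder and writing to the least-loaded node without the block."""
    heap = [(-b["heat"], b["id"], b) for b in lost_blocks]   # max-heap on heat
    heapq.heapify(heap)
    plan = []
    while heap:
        _, block_id, b = heapq.heappop(heap)
        src = min(b["holders"], key=lambda n: node_load[n])
        candidates = [n for n in node_load if n not in b["holders"]]
        if not candidates:
            continue                     # every node already holds this block
        dst = min(candidates, key=lambda n: node_load[n])
        node_load[src] += 1              # charge the extra read to the source
        node_load[dst] += 1              # and the extra write to the destination
        plan.append((block_id, src, dst))
    return plan
```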

5.
Taking minimization of total system response time as the goal and file heat (popularity) as the basis, a multi-time-window load-balancing strategy is proposed. When computing file heat, not only the number and size of accesses but also the timing of I/O accesses is taken into account, which effectively prevents unnecessary replica creation caused by short bursts of data access. The strategy uses time windows of three different sizes, which respectively handle load balancing across storage nodes, load balancing across file replicas, and deletion of surplus replicas of low-heat files. Experimental data show that the multi-time-window strategy significantly reduces I/O response time.
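One possible way to fold access timing into a per-window heat score so that a short burst does not trigger replica creation; the slotting scheme below is an illustrative assumption, not the paper's formula:

```python
def file_heat(accesses, window_start, window_end, n_slots=10):
    """A hypothetical heat score for one time window: total bytes accessed,
    scaled by how many sub-intervals of the window saw any access, so a single
    short burst scores lower than the same volume spread over the window."""
    slot_len = (window_end - window_start) / n_slots
    total_bytes = 0
    active = set()
    for ts, size_bytes in accesses:            # (timestamp, bytes read)
        if window_start <= ts < window_end:
            total_bytes += size_bytes
            active.add(int((ts - window_start) // slot_len))
    return total_bytes * (len(active) / n_slots)

# evaluate each file over three windows of different lengths; each window
# drives one decision: node balancing, replica balancing, surplus-replica deletion
```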

6.
A dynamic load-balancing algorithm for P2P systems
In real P2P network environments, load imbalance is very pronounced because nodes are heterogeneous in computing power, bandwidth and other respects. A dynamic balancing algorithm based on a data replication/migration strategy is proposed. Using node capacity, the current load state of each node and an estimate of the load-transfer cost, the algorithm finds, across the whole system, a set of lightly loaded nodes with low transfer cost and randomly selects a suitable node from this set for load migration or data replication. Experimental results show that the algorithm effectively balances the load distribution and reduces the load migration rate.
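A minimal sketch of the candidate-selection step (lightly loaded nodes with low estimated transfer cost, one chosen at random); the field names and thresholds are assumptions:

```python
import random

def pick_target(nodes, load_threshold, cost_threshold):
    """Each node dict is assumed to carry 'load', 'capacity' and
    'transfer_cost' (estimated cost of moving load to it). Candidates are
    lightly loaded, low-cost nodes; one is picked at random so that transfers
    do not all converge on the single best node."""
    candidates = [
        n for n in nodes
        if n["load"] / n["capacity"] < load_threshold
        and n["transfer_cost"] < cost_threshold
    ]
    return random.choice(candidates) if candidates else None
```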

7.
When part of a cluster consists of low-cost machines, HDFS's random placement policy may place frequently accessed data on those cheap nodes; limited by their performance, access times become too long and cluster efficiency drops. To improve on this, an improved tiered replica storage and scheduling strategy is proposed. To reduce the number of replica migrations, node performance is first evaluated from CPU, memory, network, storage load and network distance, and high-performance nodes are then selected for storage. Replica scheduling is driven by the access frequency of the replicas on each node: combined with hardware configuration, frequently accessed replicas are stored as far as possible on high-performance, well-equipped nodes to speed up cluster response. Experimental results show that the improved strategy raises replica access efficiency and improves load balancing in heterogeneous clusters.
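A hypothetical node-performance score combining the five factors named above; the weights and sign convention are illustrative, not the paper's calibration:

```python
def node_score(node, weights=None):
    """Normalized CPU, memory and network capacity count positively;
    current storage load and network distance count negatively."""
    w = weights or {"cpu": 0.3, "mem": 0.2, "net": 0.2, "load": 0.2, "dist": 0.1}
    return (w["cpu"] * node["cpu"]
            + w["mem"] * node["mem"]
            + w["net"] * node["net"]
            - w["load"] * node["storage_load"]
            - w["dist"] * node["distance"])

# rank nodes and keep frequently accessed replicas on the best-scoring ones:
# targets = sorted(nodes, key=node_score, reverse=True)
```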

8.
In structured peer-to-peer networks, load imbalance can cause single points of failure, network congestion, request delays and even system breakdown. To address the shortcomings of traditional replication algorithms, namely the lack of an effective hotspot-prevention strategy, disregard for differences in node performance, and the absence of a replica maintenance mechanism, this paper proposes an adaptive load-balancing algorithm based on node performance classification and data/node analysis. Simulation experiments show that the algorithm effectively balances the load of structured P2P networks and reduces the loss rate of data-request messages, thereby improving node utilization.

9.
Data replica management is an important fault-tolerance mechanism in large-scale distributed storage systems. To meet the need for dynamic replica management in networked environments, a file support-degree metric and its dynamic computation model are established. Through periodic data collection, the model exploits the autocorrelation of the support degree and combines parameters such as a file's access count in the previous collection period, its share of total accesses, the volume of data accessed and the file's level, yielding a fairly accurate description of each file's dynamic replica demand. Adaptive parameter adjustment lets the model track changing load, so that replica management decisions reflect the actual state of the system as closely as possible. On this basis, algorithms for data-node load balancing, replica adjustment and replica cleanup are designed, achieving the goal of dynamic replica management. Experiments verify the effectiveness of the proposed mechanism.
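A hedged sketch of a support-degree update of this kind, blending the previous period's value with the current period's access statistics; the weights and smoothing factor are assumptions, not the paper's calibrated parameters:

```python
def support_degree(prev_support, access_count, access_share, bytes_accessed,
                   file_level, alpha=0.5, weights=(0.4, 0.3, 0.2, 0.1)):
    """Blend the previous period's support degree (the autocorrelation term)
    with a weighted score of this period's access count, share of total
    accesses, bytes accessed and file level."""
    w1, w2, w3, w4 = weights
    current = (w1 * access_count + w2 * access_share
               + w3 * bytes_accessed + w4 * file_level)
    return alpha * prev_support + (1 - alpha) * current
```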

10.
Building on an in-depth study of existing load-balancing strategies, this paper proposes RGP (Replication and Gossip Policy), an adaptive, highly available hybrid load-balancing strategy for structured P2P networks built on Chord. The strategy combines gossip-based dissemination with dynamic replication, and treats light-load and heavy-load imbalance separately, in order to improve the performance of structured P2P networks.

11.
IRM: Integrated File Replication and Consistency Maintenance in P2P Systems
In peer-to-peer file sharing systems, file replication and consistency maintenance are widely used techniques for achieving high system performance. Despite significant interdependencies between them, these two issues are typically addressed separately. Most file replication methods rigidly specify replica nodes, leading to low replica utilization, unnecessary replicas and hence extra consistency maintenance overhead. Most consistency maintenance methods propagate update messages based on message spreading or a fixed structure, without considering file replication dynamism, leading to inefficient file updates and hence a high probability of serving outdated files. This paper presents an Integrated file Replication and consistency Maintenance mechanism (IRM) that integrates the two techniques in a systematic and harmonized manner. It achieves high efficiency in file replication and consistency maintenance at significantly lower cost. Instead of passively accepting replicas and updates, each node determines file replication and update polling by dynamically adapting to time-varying file query and update rates, which avoids unnecessary file replications and updates. Simulation results demonstrate the effectiveness of IRM in comparison with other approaches: it dramatically reduces overhead and yields significant improvements in the efficiency of both file replication and consistency maintenance.
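A rough illustration of rate-adaptive replication and update polling in the spirit of IRM; the threshold and the polling rule are assumptions, not IRM's actual formulas:

```python
def should_replicate(query_rate, update_rate, replicate_threshold=10.0):
    """Replicate a file at a node only while its local query rate clearly
    outweighs its update rate; otherwise the replica would mostly generate
    consistency-maintenance traffic."""
    return query_rate > replicate_threshold and query_rate > update_rate

def poll_interval(update_rate, min_s=5.0, max_s=600.0):
    """Poll for updates roughly once per expected update (1 / rate), clamped
    to sane bounds, so frequently updated files are polled more often."""
    if update_rate <= 0:
        return max_s
    return min(max_s, max(min_s, 1.0 / update_rate))
```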

12.
Effective data management is an important issue in a large-scale distributed environment such as a data cloud. It can be achieved through file replication, which reduces file service time and access latency, increases file availability and improves system load balancing. However, replication entails various costs, such as the storage and energy consumed in holding replicas. This article proposes a multi-objective offline optimization approach for replica management that treats the factors influencing replication decisions (mean file unavailability, mean service time, load variance, energy consumption and mean access latency) as five objectives. It decides the replication factor and replica layout with an improved artificial immune algorithm that evolves a set of candidate solutions through clone, mutation and selection processes. The proposed algorithm, Multi-objective Optimized Replication Management (MORM), seeks near-optimal solutions by balancing the trade-offs among the five optimization objectives. The article reports a series of experiments showing the effectiveness of MORM. The results demonstrate that MORM is energy efficient and outperforms the default replication management of HDFS (Hadoop Distributed File System) and the MOE (Multi-objective Evolutionary) algorithm in terms of performance and load balancing for large-scale cloud storage clusters.

13.
The Data Grid provides massive aggregated computing resources and distributed storage space to deal with data-intensive applications. Owing to the limited resources available in the Grid and the large volumes of data produced, efficient use of Grid resources becomes an important challenge. Data replication is a key optimization technique for reducing access latency and managing large data volumes by storing data in a wise manner. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. In this paper, two strategies are proposed. First, a novel job scheduling strategy called Weighted Scheduling Strategy (WSS) uses hierarchical scheduling to reduce the search time for an appropriate computing node; it considers the number of jobs waiting in a queue, the location of the data required by a job, and the computing capacity of the sites. Second, a dynamic data replication strategy called Enhanced Dynamic Hierarchical Replication (EDHR) improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication strategy. It uses an economic model, based on the future value of a data file, to decide which files to delete when there is not enough space for a new replica. Good replica placement plays an important role in obtaining maximum benefit from replication as well as in reducing storage cost and mean job execution time, so it is also considered in this paper. The proposed strategies are implemented in OptorSim, the European Data Grid simulator. Experimental results show that the proposed strategies achieve better performance by minimizing data access time and avoiding unnecessary replication.
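A hedged sketch of a future-value, economic-model style deletion criterion of the kind EDHR describes; the prediction and discounting scheme below are assumptions, not the paper's exact model:

```python
def replica_future_value(access_history, horizon_periods=5, discount=0.9):
    """Predict next-period accesses as the mean of recent periods, then sum
    the discounted stream over a short horizon. When space runs out, the
    lowest-valued replica is the deletion candidate."""
    if not access_history:
        return 0.0
    predicted = sum(access_history) / len(access_history)
    return sum(predicted * discount ** t for t in range(1, horizon_periods + 1))

# when a site is full, delete the replica whose recent access history gives
# the smallest replica_future_value(...)
```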

14.
Replication techniques for transaction-based distributed systems generally achieve increased availability but with a significant performance penalty. We present a new replication paradigm, the location-based paradigm, which addresses availability and other performance issues. It provides availability similar to quorum-based replication protocols but with transaction-execution delays similar to one-copy systems. The paradigm further exploits replication to improve performance in two ways. First, it takes advantage of local or nearby replicas to further improve the response time of transactions, achieving smaller execution delays than one-copy systems. Second, it takes advantage of replication to facilitate the independent crash recovery of replica sites, a goal that is unattainable in one-copy systems. In addition, the location-based paradigm avoids bottlenecks, facilitates load balancing, and minimizes the disruption of service when failures and recoveries occur. In this paper we present the paradigm, a formal proof of correctness, and a detailed simulation study comparing our paradigm to one-copy systems and to other approaches to replication control.

15.
The cloud computing environment is attracting growing interest as a new trend in data management. Data replication has been widely applied to improve data access in distributed systems such as Grids and Clouds. However, because of the finite storage capacity of each site, copies that would be useful for future jobs can be wastefully deleted and replaced with less valuable ones. It is therefore important to have a replication strategy that can dynamically store replicas while satisfying quality-of-service (QoS) requirements and storage capacity constraints. In this paper, we present a dynamic replication algorithm named hierarchical data replication strategy (HDRS). HDRS consists of replica creation, which adaptively increases the number of replicas based on an exponential growth or decay rate; replica placement, based on access load and a labeling technique; and replica replacement, based on the future value of each file. We evaluate different dynamic data replication methods using CloudSim simulation. Experiments demonstrate that HDRS reduces response time and bandwidth usage compared with other algorithms: it can identify a popular file and replicate it to the best site, avoiding useless replication and decreasing access latency by balancing the load across sites.
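A minimal sketch of exponential growth/decay of the replica count; the growth and decay factors and the popularity threshold are illustrative assumptions:

```python
def next_replica_count(current_replicas, request_rate, threshold,
                       growth=2.0, decay=0.5, min_replicas=1):
    """Grow the replica count exponentially while the request rate exceeds a
    popularity threshold; shrink it exponentially once the rate falls below."""
    if request_rate > threshold:
        return int(current_replicas * growth)
    return max(min_replicas, int(current_replicas * decay))

# e.g. 2 replicas of a suddenly popular file -> 4; an unpopular file's 4 -> 2
```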

16.
Data Grid is a geographically distributed environment that deals with large-scale data-intensive applications. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Data replication is another key optimization technique for reducing access latency and managing large data volumes by storing data in a wise manner. In this paper, two algorithms are proposed: first, a novel job scheduling algorithm called Combined Scheduling Strategy (CSS) that considers the number of jobs waiting in the queue, the location of the data required by a job, and computational capability; second, a dynamic data replication strategy called Dynamic Hierarchical Replication Algorithm (DHRA) that improves file access time. DHRA stores each replica at an appropriate site, i.e., the site in the requesting region with the highest number of accesses for that particular replica. It can also minimize access latency by selecting the best replica when several sites hold replicas of a dataset. The simulation results demonstrate that the proposed replication and scheduling strategies give better performance than the other algorithms.

17.
Cloud computing is becoming a very popular term in industry and is receiving a large amount of attention from the research community. Replica management is one of the most important issues in the cloud; it can offer fast data access, high data availability and reliability. By keeping all replicas active, replication can enhance the system's task success rate, provided the replicas and requests are reasonably distributed. However, appropriate replica placement in large-scale, dynamically scalable and fully virtualized data centers is much more complicated. To provide cost-effective availability, minimize the response time of applications and balance the load of cloud storage, a new replica placement strategy is proposed. The placement is based on five parameters: mean service time, failure probability, load variance, latency and storage usage. Replication should nevertheless be used wisely because the storage capacity of each site is limited, so a site must keep only the important replicas. We also present a new replica replacement strategy based on the availability of the file, the time the replica was last requested, the number of accesses, and the size of the replica. We evaluate our algorithm using the CloudSim simulator and find that it offers better performance than other algorithms in terms of mean response time, effective network usage, load balancing, replication frequency, and storage usage.
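A hypothetical eviction score combining the four replacement factors named in the abstract; the multiplicative form and field names are assumptions, not the paper's formula:

```python
import time

def replacement_score(replica, now=None):
    """Higher score = better eviction candidate: replicas of files that are
    already highly available elsewhere, idle for a long time, rarely accessed
    and large are evicted first."""
    now = time.time() if now is None else now
    idle = now - replica["last_requested"]          # seconds since last request
    return (replica["file_availability"] * idle * replica["size_bytes"]
            / (1 + replica["access_count"]))

# victim = max(local_replicas, key=replacement_score)
```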

18.
Data Grid is a geographically distributed environment that deals with large-scale data-intensive applications. Effective scheduling in the Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Data replication is another key optimization technique for reducing access latency and managing large data volumes by storing data in a wise manner. In this paper, two algorithms are proposed. First, a novel job scheduling algorithm called Combined Scheduling Strategy (CSS) uses hierarchical scheduling to reduce the search time for an appropriate computing node; it considers the number of jobs waiting in the queue, the location of the data required by a job, and the computing capacity of the sites. Second, a dynamic data replication strategy called the Modified Dynamic Hierarchical Replication Algorithm (MDHRA) improves file access time. This strategy is an enhanced version of the Dynamic Hierarchical Replication (DHR) strategy. Data replication should be used wisely because the storage capacity of each Grid site is limited, so it is important to design an effective replica replacement strategy. MDHRA replaces replicas based on the time a replica was last requested, the number of accesses, and the size of the replica. It selects the best replica location from among the many replicas based on response time, determined from the data transfer time, the storage access latency, the replica requests waiting in the storage queue and the distance between nodes. The simulation results demonstrate that the proposed replication and scheduling strategies give better performance than the other algorithms.
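A minimal sketch of response-time-based replica selection using the four components listed above; the field names and per-hop delay model are assumptions:

```python
def estimated_response_time(site, file_size_mb):
    """Estimate the response time of fetching a replica from one site: data
    transfer time, storage access latency, the wait behind requests already
    queued at the site's storage, and a distance penalty."""
    transfer = file_size_mb / site["bandwidth_mb_s"]
    queue_wait = site["queued_requests"] * site["mean_service_s"]
    return (transfer
            + site["storage_latency_s"]
            + queue_wait
            + site["distance_hops"] * site["per_hop_delay_s"])

# best = min(sites_holding_replica, key=lambda s: estimated_response_time(s, size_mb))
```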

19.
This work addresses the optimization of file locality, file availability, and replica migration cost in a Hadoop architecture. Our optimization algorithm is based on the Non-dominated Sorting Genetic Algorithm-II and simultaneously determines file block placement, with a variable replication factor, and MapReduce job scheduling. Our proposal has been tested with experiments considering three data center sizes (8, 16 and 32 nodes) with the same workload and number of files (150 files and 3519 file blocks). In general terms, a placement policy with a variable replication factor yields the larger improvements in our three optimization objectives; the job scheduling policy improves these objectives only when it is used along with a variable replication factor. The results also show that migration cost is a suitable optimization objective, as improvements of up to 34% were observed across the experiments.

20.
In DHT-based P2P systems, factors such as node heterogeneity and differing file access rates can all affect the efficiency of the DHT. This paper proposes an effective load-balancing algorithm for DHT-based P2P systems. The algorithm uses a fully distributed mechanism to maintain historical information on file accesses, which is used to predict future file access frequencies. A new load-balancing algorithm is designed in which, when a new node joins, the history information and node heterogeneity are used together to determine the best load assignment. During operation, load can also be redistributed dynamically if overloaded nodes appear. The algorithm does not use virtual servers, which reduces the processing overhead of maintaining routing metadata.

