首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
To efficiently execute a finite element program on a 2D torus, we need to map nodes of the corresponding finite element graph to processors of a 2D torus such that each processor has approximately the same amount of computational load and the communication among processors is minimized. If nodes of a finite element graph do not increase during the execution of a program, the mapping only needs to be performed once. However, if a finite element graph is solution-adaptive, that is, nodes of a finite element graph increase discretely due to the refinement of some finite elements during the execution of a program, a dynamic load-balancing algorithm has to be performed many times in order to balance the computational load of processors while keeping the communication cost as low as possible. In the paper we propose a parallel dynamic load-balancing algorithm (LB) to deal with the load-imbalancing problem of a solution-adaptive finite element program on a 2D torus. The algorithm uses an iterative approach to achieve load-balancing. We have implemented the proposed algorithm along with two parallel mapping algorithms, parallel orthogonal recursive bisection (ORB) and parallel recursive mincut bipartitioning (MC), on a simulated 2D torus. Three criteria, the execution time of load-balancing algorithms, the computation time of an application program under different load balancing algorithms, and the total execution time of an application program (under several refinement phases) are used for performance evaluation. Simulation results show that (1) the execution of LB is faster than those of MC and ORB; (2) the mappings of LB are better than those of ORB and MC; and (3) the speedups of LB are better than those of ORB and MC.  相似文献   

2.
Many scientific and high-performance computing applications consist of multiple processes running on different processors that communicate frequently. Because of their synchronization needs, these applications can suffer severe performance penalties if their processes are not all coscheduled to run together. Two common approaches to coscheduling jobs are batch scheduling, wherein nodes are dedicated for the duration of the run, and gang scheduling, wherein time slicing is coordinated across processors. Both work well when jobs are load-balanced and make use of the entire parallel machine. However, these conditions are rarely met and most realistic workloads consequently suffer from both internal and external fragmentation, in which resources and processors are left idle because jobs cannot be packed with perfect efficiency. This situation leads to reduced utilization and suboptimal performance. Flexible coscheduling (FCS) addresses this problem by monitoring each job's computation granularity and communication pattern and scheduling jobs based on their synchronization and load-balancing requirements. In particular, jobs that do not require stringent synchronization are identified, and are not coscheduled; instead, these processes are used to reduce fragmentation. FCS has been fully implemented on top of the STORM resource manager on a 256-processor alpha cluster and compared to batch, gang, and implicit coscheduling algorithms. This paper describes in detail the implementation of FCS and its performance evaluation with a variety of workloads, including large-scale benchmarks, scientific applications, and dynamic workloads. The experimental results show that FCS saturates at higher loads than other algorithms (up to 54 percent higher in some cases), and displays lower response times and slowdown than the other algorithms in nearly all scenarios.  相似文献   

3.
In a distributed computing environment, it is important to ensure that the processor workloads are adequately balanced. Among numerous load-balancing algorithms, a unique approach given by Das and Prasad defines a symmetric broadcast network (SBN) that provides a robust communication pattern among the processors in a topology-independent manner. In this paper, we propose and analyze three efficient SBN-based dynamic load-balancing algorithms, and implement them on an SGI Origin2000. A thorough experimental study with Poisson-distributed synthetic loads demonstrates that our algorithms are effective in balancing system load. By optimizing completion time and idle time, the proposed algorithms are shown to compare favorably with several existing approaches.  相似文献   

4.
Parallel code presents a non-trivial problem of load balancing computational workload throughout a system of hardware and software resources. The task of load balancing is further complicated when the number of allowable processors changes through time. This paper presents a two-component load-balancing mechanism using optimal initial workload distribution and dynamic load maintenance. The initial guess is provided by inversion of the workload distribution function. Workload distribution inversion enables efficient domain decomposition for arbitrary workloads and easily compensates for changes in system resources. Dynamic load balancing is provided by process feedback control as used, for example, in control mechanisms of physical processes. Proportional, integral, and differential (PID) feedback readily allows controls to compensate for runtime-changes of the workload distribution function. This paper demonstrates a one-dimensional realization of the ideas presented here. We apply this load-balancing technique to our gridless direct simulation Monte Carlo algorithm. We demonstrate that the method does indeed maintain uniform workload distribution across available resources as the workload and usable system resources undergo change through time.  相似文献   

5.
We describe a compiler and run-time system that allow data-parallel programs to execute on a network of heterogeneous UNIX workstations. The programming language supported is Dataparallel C, a SIMD language with virtual processors and a global name space. This parallel programming environment allows the user to take advantage of the power of multiple workstations without adding any message-passing calls to the source program. Because the performance of Individual workstations in a multi-user environment may change during the execution of a Dataparallel C program, the run-time system automatically performs dynamic load balancing. We present experimental results that demonstrate the usefulness of dynamic load-balancing In a multi-user environment These results suggest that initially allocating the same amount of work to each processor and letting the dynamic load balancing algorithm adjust the load during program execution yields very good performance. Hence neither the compiler nor the run-time system need a priori knowledge of the speeds of the machines that will participate in a program execution.  相似文献   

6.
Grid computing has emerged a new field, distinguished from conventional distributed computing. It focuses on large-scale resource sharing, innovative applications and in some cases, high performance orientation. The Grid serves as a comprehensive and complete system for organizations by which the maximum utilization of resources is achieved. The load balancing is a process which involves the resource management and an effective load distribution among the resources. Therefore, it is considered to be very important in Grid systems. For a Grid, a dynamic, distributed load balancing scheme provides deadline control for tasks. Due to the condition of deadline failure, developing, deploying, and executing long running applications over the grid remains a challenge. So, deadline failure recovery is an essential factor for Grid computing. In this paper, we propose a dynamic distributed load-balancing technique called “Enhanced GridSim with Load balancing based on Deadline Failure Recovery” (EGDFR) for computational Grids with heterogeneous resources. The proposed algorithm EGDFR is an improved version of the existing EGDC in which we perform load balancing by providing a scheduling system which includes the mechanism of recovery from deadline failure of the Gridlets. Extensive simulation experiments are conducted to quantify the performance of the proposed load-balancing strategy on the GridSim platform. Experiments have shown that the proposed system can considerably improve Grid performance in terms of total execution time, percentage gain in execution time, average response time, resubmitted time and throughput. The proposed load-balancing technique gives 7 % better performance than EGDC in case of constant number of resources, whereas in case of constant number of Gridlets, it gives 11 % better performance than EGDC.  相似文献   

7.
To solve the load imbalance problem of a solution-adaptive finite element application program on a distributed memory multicomputer, nodes of a refined finite element graph can be remapped to processors or load of a refined finite element graph can be redistributed based on the current load of each processor. For the former case, remapping can be performed by some fast mapping algorithms. For the latter case, a load-balancing algorithm can be applied to balance the computational load of each processor. In this paper, three tree-based parallel load-balancing methods, the MCSTLB method, the BTLB method, and the CBTLB method, were proposed to deal with the load imbalance problems of solution-adaptive finite element application programs. To evaluate the performance of the proposed methods, we have implemented those methods along with three mapping methods, the AE/ORB method, the AE/MC method, and the MLkP method, on an SP2 parallel machine. Three criteria, the execution time of mapping/load-balancing methods, the execution time of a solution-adaptive finite element application program under different mapping/load-balancing methods, and the speedups achieved by mapping/load-balancing methods for a solution-adaptive finite element application program, are used for the performance evaluation. The experimental results show that 1) if the initial mapping is performed by a mapping method and the same mapping method and load-balancing methods were used in each refinement to balance the load of processors, the execution time of an application program under a load-balancing method is always shorter than that of the mapping method, and 2) the execution time of an application program under the CBTLB method is shorter than that of the BTLB method and the MCSTLB method  相似文献   

8.
A trace-driven simulation study of dynamic load balancing   总被引:2,自引:0,他引:2  
A trace-driven simulation study of dynamic load balancing in homogeneous distributed systems supporting broadcasting is presented. Information about job CPU and input/output (I/O) demands collected from production systems is used as input to a simulation model that includes a representative CPU scheduling policy and considers the message exchange and job transfer cost explicitly. Seven load-balancing algorithms are simulated and their performances compared. Load balancing is capable of significantly reducing the mean and standard deviation of job response times, especially under heavy load, and for jobs with high resource demands. Algorithms based on periodic or nonperiodic load information exchange provide similar performance, and, among the periodic policies, the algorithms that use a distinguished agent to collect and distribute load information cut down the overhead and scale better. With initial job placements only, source initiative algorithms were found to perform better than server initiative algorithms. The performances of all hosts, even those originally with light loads, are generally improved by load balancing  相似文献   

9.
负载平衡是影响并行绘制效率的关键问题。提出了动态负载平衡算法两阶段映射的模型,给出了负载平衡性能的一种度量方法;还提出了一种最佳的任务调度算法,对该算法的性能进行了分析,得出绘制时间的理论上限值,同时给出了多任务划分的方法。  相似文献   

10.
针对在时间和空间上都具有高计算成本的长序列数据库,一个更有效和更紧凑且可以完全提取信息的挖掘模式是当前的研究热点。提出一种并行动态位向量频繁闭合序列模式的挖掘算法(PDBV FCSP),该算法采用多核处理器架构和DBV数据结构相结合的方式,有效加快了序列数据库的处理速度,并对搜索空间进行划分,尽早执行预处理序列的闭合检查,减少了所需的存储空间和挖掘频繁闭合序列模式的执行时间,克服了现有并行挖掘算法通信开销、同步和数据复制等问题。利用重新分配工作的动态负载平衡机制,解决处理器之间的负载均衡问题,最大限度地减少了CPU空闲时间。对DBV VDF算法和PDBV FCSP(2 4核)算法进行仿真比较,结果表明,PDBV FCSP算法在运行时间、内存使用和可伸缩性等方面都有较优的性能提升,且当内核数增加时,性能更优。  相似文献   

11.
提出一种基于树型计算网格的自适应调度算法,实现对小粒度独立任务和用户大作业的自适应最优调度。通过对网格环境的实时检测,给出了基于节点负载状况、节点任务执行时间、任务传输时间和任务特性的自适应调度算法,即基于最优任务分配方案的启发式任务调度算法。通过实验与其他调度算法的比较,证明了所提出的任务调度算法在负载平衡和最优跨度方面具有明显的优越性。  相似文献   

12.
应用于高性能计算领域的通用GPU拥有强大的并行计算能力,以通用GPU作为主处理器的数据分析系统相较于传统数据库能够提供更好的性能。在大数据场景下,如何根据CPU和GPU的资源在处理器之间合理分配工作负载是亟待解决的问题。提出了一种CPU GPU异构数据分析系统上的负载均衡处理策略。该策略采用流水线模型将工作负载分解,基于流水线设计了负载均衡模型,将工作负载合理分配至异构处理器,减少系统总执行时间开销,实现了性能提升。实验结果表明,提出的基于流水线的负载均衡模型能适应不同查询请求下的不同数据量场景,具有良好的性能。  相似文献   

13.
Scheduling jobs dynamically on processors is likely to achieve better performance in multiprocessor and distributed real-time systems. Exhaustive methods for determining whether all jobs complete by their deadlines, in systems that use modern priority-driven scheduling strategies, are often infeasible or unreliable since the execution time of each job may vary. We previously published research results on finding worst-case bounds and efficient algorithms for validating systems in which independent jobs have arbitrary release times and deadlines, and are scheduled on processors dynamically in a priority-driven manner. An efficient method has been proposed to determine how late the completion times of jobs can be in dynamic systems where the jobs are preemptable and nonmigratable. This paper further presents the performance characteristics of the proposed methods, and shows its soundness by providing extensive simulation results. The worst-case completion times of jobs obtained with the proposed methods are compared with the values by simulations under different workload characteristics. The simulation results show that the proposed algorithm performs considerably well for diverse workloads. Considering the previous work showed the unlikelihood of finding tighter bounds than the one given in the paper, the simulation results indicate that the proposed methods effectively constitute a theoretical basis needed for a comprehensive validation strategy that is capable of dealing with dynamic distributed real-time systems.  相似文献   

14.
在机群系统中结点分配策略根据一定的原则为作业确定运行结点是提高系统性能的关键。通过对机群结点分配策略的研究,作者发现当前基于负载平衡自适应的结点分配策略为并行作业选择负载最轻的结点,这不利于系统性能的充分发挥。作者提出了一种新的自适应负载平衡结点分配算法:受限负载平衡结点分配。  相似文献   

15.
一种改进的基于动态反馈的负载均衡算法   总被引:12,自引:0,他引:12  
负载均衡是集群系统研究的一个重要问题,负载均衡算法是集群任务分配的核心,介绍了LVS中的负载均衡算法,讨论了常用算法的不足,在分析这些算法各自优缺点的基础上,提出了一种改进的基于反馈的负载均衡算法,算法引入一个负载容余参数以更准确地描述集群节点的负载状况,在考虑服务节点真实负载,处理能力的基础上,尽量简化负载均衡器的任务分配算法.测试结果显示该算法优于静态算法.  相似文献   

16.
网格环境由于其可扩展性、异构性以及大量的传输延迟,使得网格环境下的负载均衡不同于传统的分布式系统。提出了一种动态的分布式负载均衡算法,该算法综合考虑网格站点的处理能力和站点之间的传输延迟,采用即时分配策略来降低作业的执行成本,目标是使系统平均作业响应时间最小化。仿真结果显示该算法显著减少了作业的平均响应时间。  相似文献   

17.
Load balancing is a crucial factor in IPTV delivery networks. Load balancing aims at utilizing the resources efficiently, maximizing the throughput, and minimizing the request rejection rate. The peer-service area is the recent architecture for IPTV delivery networks that overcomes the flaws of the previous architectures. However, it still suffers from the load imbalance problem. This paper investigates the load imbalance problem, and tries to augment the peer-service area architecture to overcome this problem. To achieve the load balancing over the proposed architecture, we suggest a new load-balancing algorithm that considers both the expected and the current load of both contents and servers. The proposed load-balancing algorithm consists of two stages. The first stage is the contents replication according to their expected load, while the second stage is the content-aware request distribution. To test the effectiveness of the proposed algorithm, we have compared it with both the traditional Round Robin algorithm and Cho algorithm. The experimental results depict that the proposed algorithm outperforms the two other algorithms in terms of load balance, throughput, and request rejection rate.  相似文献   

18.
The paper presents a new approach that uses neural networks to predict the performance of a number of dynamic decentralized load-balancing strategies. A distributed multicomputer system using distributed load-balancing strategies is represented by a unified analytical queuing model. A large simulation data set is used to train a neural network using the back-propagation learning algorithm based on gradient descent The performance model using the predicted data from the neural network produces the average response time of various load balancing algorithms under various system parameters. The validation and comparison with simulation data show that the neural network is very effective in predicting the performance of dynamic load-balancing algorithms. Our work leads to interesting techniques for designing load balancing schemes (for large distributed systems) that are computationally very expensive to simulate. One of the important findings is that performance is affected least by the number of nodes, and most by the number of links at each node in a large distributed system.  相似文献   

19.
The exponential demands for high performance web servers led to use of cluster-based web servers. This increasing trend continues as dynamic contents are changing traditional web environments. Increasing utilization of cluster web servers through effective and fair load balancing is a crucial task specifically when it comes to advent of dynamic contents and database-driven applications on the internet. The proposed load-balancing algorithm classifies requests into different classes. The algorithm dynamically selects a request from a class and assigns the request to a server. For both the scheduling and dispatching, new probabilistic algorithms are proposed. To avoid using unreliable measured utilization in the face of fluctuating loads the proposed load-balancing algorithm benefits from a queuing model to predict the utilization of each server. We also used a control loop feedback to adjust the predicted values periodically based on soft computing techniques. The implementation results, using standard benchmarks confirms the effectiveness of proposed load-balancing algorithm. The algorithm significantly improves both the throughput and mean response time in contrast to two existing load-balancing algorithms.  相似文献   

20.
Load balancing has been a key concern for traditional multiprocessor systems. The emergence of computational grids extends this challenge to deal with more serious problems, such as scalability, heterogeneity of computing resources and considerable transfer delay. In this paper, we present a dynamic and decentralized load balancing algorithm for computationally intensive jobs on a heterogeneous distributed computing platform. The time spent by a job in the system is considered as the main issue that needs to be minimized. Our main contributions are: (1) Our algorithm uses site desirability for processing power and transfer delay to guide load assignment and redistribution, (2) Our transfer and location policies are a combination of two specific strategies that are performance driven to minimize execution cost. These two policies are the Instantaneous Distribution Policy (IDP) and the Load Adjustment Policy (LAP), (3) The communication overhead involved in information collection is reduced using mutual information feedback. The simulation results show that our proposed algorithm outperforms conventional approaches over a wide range of system parameters.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号