Similar literature
 Found 20 similar documents (search time: 0 ms)
1.
The paper concerns parallel methods for extremal optimization (EO) applied to processor load balancing in the execution of distributed programs. In these methods, EO algorithms find an optimized task migration strategy that reduces program execution time. We use an improved EO algorithm with guided state changes (EO-GS) that provides a parallel search for the next solution state during solution improvement, based on some knowledge of the problem. The search is based on a two-step stochastic selection using two fitness functions, which account for the computation and communication assessment of migration targets. Based on the improved EO-GS approach, we propose and evaluate several versions of parallelization methods for EO algorithms in the context of processor load balancing. Some of them use the crossover operation known from genetic algorithms. The quality of the proposed algorithms is evaluated by experiments with simulated load balancing in the execution of distributed programs represented as macro data flow graphs. Load balancing based on the improved EO parallelized in this way provides better convergence of the algorithm, fewer task migrations, and reduced execution time of applications.

2.
In this paper, we study parallel branch and bound on fine grained hypercube multiprocessors. Each processor in a fine grained system has only a very small amount of memory available. Therefore, current parallel branch and bound methods for coarse grained systems ( 1000 nodes) cannot be applied, since all these methods assume that every processor stores the path from the node it is currently processing back to the node where the process was created (the back-up path). Furthermore, the much larger number of processors available in a fine grained system makes it imperative that global information (e.g. the current best solution) is continuously available at every processor; otherwise the amount of unnecessary search would become intolerable. We describe an efficient branch-and-bound algorithm for fine grained hypercube multiprocessors. Our method uses a global scheme where all processors collectively store all back-up paths such that each processor needs to store only a constant amount of information. At each iteration of the algorithm, all current nodes may decide whether they need to create new children, be pruned, or remain unchanged. We describe an algorithm that, based on these decisions, updates the current back-up paths and distributes global information in O(log m) steps, where m is the current number of nodes. This method also includes dynamic allocation of search processes to processors and provides optimal load balancing. Even if very drastic changes in the set of current nodes occur, our load balancing mechanism does not suffer any slowdown.

3.
This paper diverges from traditional load balancing and introduces a new principle called the on-machine load balance rule. The on-machine load balance rule leads to resource allocations that are better at tolerating uncertainties in the processing times of the tasks allocated to the resources than resource allocations derived using the conventional “across-the-machines” load balancing rule. The on-machine load balance rule calls for resource allocation algorithms to allocate similarly sized tasks on a machine (in addition to optimizing some primary performance measures such as estimated makespan and average response time). The on-machine load balance rule is very different from the usual across-the-machines load balance rule, which strives to balance load across resources so that all resources have similar finishing times. We give a mathematical justification for the on-machine load balance rule requiring only liberal assumptions about task processing times. Then we validate with extensive simulations that the resource allocations derived using the on-machine load balance rule are indeed more tolerant of uncertain task processing times.

4.
The DLB (Dynamic Load Balancing) library and LeWI (LEnd When Idle) algorithm provide a runtime solution to deal with the load imbalance of parallel applications independently of the source of imbalance. DLB relies on the usage of hybrid programming models and exploits the malleability of the second level of parallelism to redistribute computation power across processes.

5.
In this paper, we propose a new method to compute lower bounds for curriculum-based course timetabling (CTT), which calls for the best weekly assignment of university course lectures to rooms and time slots. The lower bound is obtained by splitting the objective function into two parts, considering one separate problem for each part of the objective function, and summing up the corresponding optimal values (or, in some cases, lower bounds on these values), found by formulating the two parts as Integer Linear Programs (ILPs). The solution of one ILP is obtained by using a column generation procedure. Experimental results show that the proposed lower bound is often better than the ones found by the previous methods in the literature, and also much better than those found by other new ILP formulations illustrated in this paper. The proposed approach is able to obtain improved lower bounds on real-world benchmark instances from the literature, used in the international timetabling competitions ITC2002 and ITC2007, proving for the first time that some of the best-known heuristic solutions are indeed optimal (or close to the optimal ones).

6.
We propose a new proof technique which can be used to analyse many parallel load balancing algorithms. The technique is designed to handle concurrent load balancing actions, which are often the main obstacle in the analysis. We demonstrate the usefulness of the approach by analysing various natural diffusion-type protocols. Our results are similar to, or better than, previously existing ones, while our proofs are much easier.
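
To make the notion of a diffusion-type protocol with concurrent balancing actions concrete, the following is a minimal sketch of one synchronous round of first-order diffusion. It is an illustration only and is not taken from the paper: the function name `diffusion_round`, the adjacency-list graph representation, and the parameter `alpha` are all assumptions, and the neighbor lists are assumed to be symmetric.

```python
from typing import Dict, List


def diffusion_round(load: Dict[int, float],
                    neighbors: Dict[int, List[int]],
                    alpha: float) -> Dict[int, float]:
    """One synchronous round of first-order diffusion (illustrative sketch).

    Every node i shifts alpha * (load[i] - load[j]) toward each neighbor j.
    All transfers are computed from the loads at the start of the round,
    which models the balancing actions happening concurrently.
    """
    new_load = dict(load)
    for i, adjacent in neighbors.items():
        for j in adjacent:
            # A positive difference means i gives load away, a negative one
            # means i receives. With symmetric neighbor lists, node j applies
            # the opposite change, so the total load is preserved.
            new_load[i] -= alpha * (load[i] - load[j])
    return new_load


# Hypothetical example: a ring of four nodes with all load on node 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
after_one_round = diffusion_round({0: 8, 1: 0, 2: 0, 3: 0}, ring, alpha=0.25)
```

In this toy run the total load stays at 8 while the maximum load drops, which is the kind of behavior such analyses aim to quantify over many rounds.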

7.
The Job Scheduling with Cancellation problem is a variation of classical scheduling problems in which jobs can be cancelled while waiting for execution. In this paper we prove a tight lower bound of 5 for the competitive ratio of any deterministic online algorithm for this problem, for the case where all jobs have the same processing time.

8.
Due to the emergence of grid computing over the Internet, there is a need for a hybrid load balancing algorithm which takes into account the various characteristics of the grid computing environment. Hence, this research proposes a fault-tolerant hybrid load balancing strategy, AlgHybrid_LB, which takes into account grid architecture, computer heterogeneity, communication delay, network bandwidth, resource availability, resource unpredictability and job characteristics. AlgHybrid_LB combines the strong points of neighbor-based and cluster-based load balancing algorithms. Our main objective is to arrive at job assignments that achieve minimum response time and optimal computing node utilization. Major achievements include the low complexity of the proposed approach and a drastic reduction in the number of additional communications induced by load balancing. The proposed approach is simulated using the Grid Simulation Toolkit (GridSim). Experimental results show that the proposed algorithm performs very well in a large grid environment.

9.
The prevalence of dynamic-content web services, exemplified by search and online social networking, has motivated an increasingly wide web-facing front end. Horizontal scaling in the Cloud is favored for its elasticity, and distributed design of load balancers is highly desirable. Existing algorithms with a centralized design, such as Join-the-Shortest-Queue (JSQ), incur high communication overhead for distributed dispatchers. We propose a novel class of algorithms called Join-Idle-Queue (JIQ) for distributed load balancing in large systems. Unlike algorithms such as Power-of-Two, the JIQ algorithm incurs no communication overhead between the dispatchers and processors at job arrivals. We analyze the JIQ algorithm in the large system limit and find that it effectively results in a reduced system load, which produces a 30-fold reduction in queueing overhead compared to Power-of-Two at medium to high load. An extension of the basic JIQ algorithm deals with very high loads using only local information of server load.
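
The key property claimed for JIQ, that a dispatcher does not need to contact any processor when a job arrives, can be sketched as follows. This is a simplified illustration, not the paper's specification: it assumes each dispatcher keeps a local queue of servers that have reported themselves idle, that an idle server reports to a randomly chosen dispatcher, and that a dispatcher with no known idle server falls back to a uniformly random choice. The names `Dispatcher`, `dispatch`, and `report_idle` are hypothetical.

```python
import random
from collections import deque


class Dispatcher:
    """A dispatcher with a local queue of servers known to be idle (sketch)."""

    def __init__(self, servers, rng):
        self.servers = list(servers)  # ids of all servers it may use
        self.rng = rng                # random source for the fallback choice
        self.idle = deque()           # ids of servers that reported idle

    def dispatch(self):
        """Choose a server for an arriving job without querying any server."""
        if self.idle:
            # Use a server that announced itself as idle earlier.
            return self.idle.popleft()
        # Assumed fallback: no idle server known, pick one at random.
        return self.rng.choice(self.servers)


def report_idle(server_id, dispatchers, rng):
    """Called when a server becomes idle: it joins one dispatcher's queue."""
    rng.choice(dispatchers).idle.append(server_id)


# Hypothetical usage with a single dispatcher and three servers.
dispatcher = Dispatcher(servers=[0, 1, 2], rng=random.Random(0))
report_idle(2, [dispatcher], random.Random(1))
first_choice = dispatcher.dispatch()   # served from the idle queue
second_choice = dispatcher.dispatch()  # idle queue is empty, random fallback
```

The messages from idle servers are sent off the critical path of job arrivals, which is why the dispatch decision itself needs no communication in this sketch.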

10.
With the development of large-scale multiagent systems, agents are often organized in network structures where each agent interacts only with its immediate neighbors in the network. Coordination among networked agents is a critical issue which mainly includes two aspects: task allocation and load balancing. In the traditional approach, the resources of agents are crucial to their ability to obtain tasks, which is called talent-based allocation. However, in networked multiagent systems, tasks may incur high communication costs among agents, and these costs are sensitive to the agents' localities. This paper therefore presents a novel idea for task allocation and load balancing in networked multiagent systems, which takes into account both the talents and the centralities of agents. The paper first compares talent-based task allocation with centrality-based task allocation; then, it explores the load balancing of the two approaches. The experimental results show that the centrality-based method reduces the communication costs for a single task more effectively than the talent-based one, but the talent-based method generally obtains better load balancing performance for parallel tasks than the centrality-based one.

11.
The paper gives a new proof of the well-known lower bound $\Omega(n \log n)$ for the algebraic complexity of the function $x_1^n + x_2^n + \dots + x_n^n$ (if the characteristic of the ground field does not divide $n$). As a tool, the proof uses a computational model that counts only duplicators.

12.
In this paper, we present a topology-aware load balancing algorithm for parallel multi-core machines and its proof of asymptotic convergence to an optimal solution. The algorithm, named HwTopoLB, aims to improve application performance by reducing core idleness and communication delays. HwTopoLB was designed taking into account the properties of current parallel systems composed of multi-core compute nodes, namely their network interconnection and their complex and hierarchical core topology. The latter comprises multiple levels of cache and a memory subsystem with NUMA design. These systems provide high processing power at the expense of asymmetric communication costs, which, if ignored, can hamper the performance of parallel applications depending on their communication patterns. Our load balancing algorithm models asymmetries in terms of latencies and bandwidths, representing the distances and communication costs among hardware components. We have implemented HwTopoLB using the Charm++ Parallel Runtime System and evaluated its performance with two different benchmarks and one application. Our experimental results with HwTopoLB exhibit scalability over clustered multi-core compute nodes, and average performance improvements of 23% over execution without load balancers and 19% over the existing load balancing strategies on different multi-core systems.

13.
In many data-centric storage techniques, each event is mapped to a hashed location according to its event type. However, most of them handle storage memory space poorly because a high percentage of the load is assigned to a relatively small portion of the sensor nodes. Hence, these nodes may fail to store the data of the sensor nodes effectively. To solve the problem, we propose a grid-based dynamic load balancing approach for data-centric storage in sensor networks that relies on two schemes: (1) a cover-up scheme to deal with the problem of a storage node whose memory space is depleted, which can adjust the number of storage nodes dynamically; (2) multi-threshold levels to achieve load balancing in each grid so that all nodes are load balanced. Simulations have shown that our scheme can enhance the quality of data and avoid storage hotspots when there are a vast number of events in a sensor network.

14.
SALSA: QoS-aware load balancing for autonomous service brokering   Total citations: 1, self-citations: 0, citations by others: 1
The evolution towards “Software as a Service”, facilitated by various web service technologies, has led to applications composed of a number of service building blocks. These applications are dynamically composed by web service brokers, but rely critically on the proper functioning of each of the composing subparts, which is not entirely under the control of the applications themselves. The problem at hand for the provider of the service is to guarantee non-functional requirements such as service access and performance to each customer. To this end, the service provider typically divides the load of incoming service requests across the available server infrastructure. In this paper we describe an adaptive load balancing strategy called SALSA (Simulated Annealing Load Spreading Algorithm), which is able to guarantee for different customer priorities, such as default and premium customers, that the services are handled in a given time, without the need to adapt the servers executing the service logic themselves. It will be shown that by using SALSA, web service brokers are able to autonomously meet SLAs without a priori over-dimensioning resources. This is done by taking into account a real-time view of the requests, measuring the Poisson arrival rates at that moment, and selectively dropping some requests from default customers. In this way the web servers’ load is reduced in order to guarantee the service time for premium customers and provide best effort to default customers. We compared the results of SALSA with weighted round-robin (WRR), nowadays the most used load balancing strategy, and it was shown that the SALSA algorithm requires slightly more processing than WRR but, unlike WRR, is able to offer guarantees by dynamically adapting its load balancing strategy.
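
For readers unfamiliar with the weighted round-robin (WRR) baseline that SALSA is compared against, a minimal sketch of the general idea follows. It illustrates only the baseline, not SALSA, and it is not taken from the paper: the function name `weighted_round_robin` and the server names are hypothetical, and production implementations often interleave servers more smoothly than this simple expansion does.

```python
from itertools import cycle
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield server names forever, each in proportion to its integer weight.

    The schedule is static: a server with weight w appears w times in a row
    in every cycle, regardless of the current request arrival rate.
    """
    order = [server for server, weight in weights.items()
             for _ in range(weight)]
    return cycle(order)


# Hypothetical example: server "a" has twice the capacity of server "b".
schedule = weighted_round_robin({"a": 2, "b": 1})
first_six = [next(schedule) for _ in range(6)]
```

Because the schedule never reacts to the observed load, a dispatcher of this kind cannot by itself prioritize one class of customers over another, which is the gap the abstract says SALSA addresses.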

15.
While Graphics Processing Units (GPUs) show high performance for problems with regular structures, they do not perform well for irregular tasks due to the mismatches between irregular problem structures and SIMD-like GPU architectures. In this paper, we introduce a new library, CUIRRE, for improving the performance of irregular applications on GPUs. CUIRRE reduces the load imbalance of GPU threads resulting from irregular loop structures. In addition, CUIRRE can characterize irregular applications in terms of their irregularity, thread granularity and GPU utilization. We employ this library to characterize and optimize both synthetic and real-world applications. The experimental results show that a performance improvement of 1.63× on average and up to 2.76× can be achieved with the centralized task pool approach in the library, at a 4.57% average overhead with static loading ratios. To avoid the cost of exhaustive searches of loading ratios, an adaptive loading ratio method is proposed to derive appropriate loading ratios for different inputs automatically at runtime. Our task pool approach outperforms other load balancing schemes such as the task stealing method and the persistent threads method. The CUIRRE library can easily be applied to many other irregular problems.

16.
A repartitioning hypergraph model for dynamic load balancing   Total citations: 1, self-citations: 0, citations by others: 1
In parallel adaptive applications, the computational structure of the applications changes over time, leading to load imbalances even though the initial load distributions were balanced. To restore balance and to keep communication volume low in further iterations of the applications, dynamic load balancing (repartitioning) of the changed computational structure is required. Repartitioning differs from static load balancing (partitioning) in the additional requirement of minimizing the migration cost of moving data from an existing partition to a new partition. In this paper, we present a novel repartitioning hypergraph model for dynamic load balancing that accounts for both communication volume in the application and migration cost to move data, in order to minimize the overall cost. The use of a hypergraph-based model allows us to accurately model communication costs rather than approximate them with graph-based models. We show that the new model can be realized using hypergraph partitioning with fixed vertices and describe our parallel multilevel implementation within the Zoltan load balancing toolkit. To the best of our knowledge, this is the first implementation for dynamic load balancing based on hypergraph partitioning. To demonstrate the effectiveness of our approach, we conducted experiments on a Linux cluster with 1024 processors. The results show that, in terms of reducing total cost, our new model compares favorably to the graph-based dynamic load balancing approaches, and multilevel approaches improve the repartitioning quality significantly.

17.
We address the problem of porting parallel distributed applications from static homogeneous cluster environments to dynamic heterogeneous Grid resources. We introduce a generic technique for adaptive load balancing of parallel applications on heterogeneous resources and evaluate it using a case study application: a Virtual Reactor for simulation of plasma chemical vapour deposition. This application has a modular architecture with a number of loosely coupled components suitable for distribution over the Grid. It requires large parameter space exploration that allows using Grid resources for high-throughput computing. The Virtual Reactor contains a number of parallel solvers originally designed for homogeneous computer clusters that needed adaptation to the heterogeneity of the Grid. In this paper we study the performance of one of the parallel solvers, apply the technique developed for adaptive load balancing, evaluate the efficiency of this approach and outline an automated procedure for optimal utilization of heterogeneous Grid resources for high-performance parallel computing.

18.
A parallel ray tracing algorithm is presented. It subdivides the scene into 3D regions, the adjacency of which is modelled by a connectivity graph of regions. Since a ray tracing process is associated with each region, this graph becomes a graph of processes, the edges of which represent the communications between processes. This graph of processes is suitably mapped onto a hypercube topology so as to minimize the communication cost. Static load balancing is performed, and solutions are given to the problems of network congestion and termination. This work has been supported by C3 and by the CCETT (Centre Commun d'Etudes de Télédiffusion et Télécommunications) under contract 86ME46.

19.
In this paper, we investigate the problem of failure-tolerant multicast requests in survivable networks and propose a new heuristic algorithm called segment protection with load balancing (SPLB) to address single-link failures. To obtain better performance, SPLB first uses the techniques of cross-sharing and self-sharing to improve the resource utilization ratio; second, it uses a segment protection routing algorithm to overcome the trap problem; and third, it uses a load balancing method to reduce the blocking probability. Simulation results show that SPLB performs better than the conventional algorithm, in line with our expectations.

20.
When large, complex products (containing a large number of assembly tasks, most of which have long task times) are produced on simple or two-sided assembly lines, hundreds of stations are required. Such products also require a long product flow time, a large area for establishing the line, a high budget for investment in equipment and tools at the stations, and a large amount of work-in-process. To avoid these disadvantages, assembly lines with parallel multi-manned workstations can be utilized. In this paper, these lines and one of their balancing problems are addressed, and a branch and bound algorithm is proposed. The algorithm is composed of a branching scheme and some efficient dominance and feasibility criteria based on problem-specific knowledge. Heuristic-based guidance for the enumeration process is included as an efficient component of the algorithm as well. The VWSolver algorithm, proposed in the literature for a special version of the problem, has been modified and compared with the proposed algorithm. Results show that the proposed algorithm outperforms VWSolver in terms of both CPU time and the quality of the feasible solutions found.
