Similar Documents
20 similar documents were found for this query.
1.
Optimal distribution of divisible loads in bus networks is considered in this paper. The problem of minimizing the processing time is investigated by including all the overhead components that could penalize the performance of the system, in addition to the inherent communication and computation delays. These overheads are considered to be constant additive factors to the respective communication and computation components. A closed-form solution for the processing time is derived, and the influence of the overheads on the optimal processing time is analyzed. We derive a necessary and sufficient condition for the existence of the optimal processing time. We then study the effect of changing the load distribution sequence on the time performance. Through rigorous analysis, an optimal sequence for distributing the load among the processors is identified, whenever it exists. In case such an optimal sequence fails to exist, we present a greedy algorithm to obtain a suboptimal sequence based on some important properties of the overhead factors. Then, the effect of the granularity of the divisible data is considered in the analysis for the case of homogeneous networks. An integer approximation algorithm capable of generating integer values of the load fractions in time O(m), where m is the number of processors in the network, is proposed. We then show that the upper bound on the suboptimal solution generated by our algorithm lies within a radius given by the sum of the computation and communication delays. Several numerical examples are presented to illustrate the concepts.
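The closed-form approach can be illustrated with the classical equal-finish-time recursion for a bus network. The sketch below is a minimal Python version under assumed notation (w[i], z, Tcp, Tcm for speed parameters, dcm/dcp for constant additive communication/computation overheads); it is not the paper's exact derivation.

```python
# Minimal sketch: equal-finish-time load fractions on a bus network with
# constant additive overheads. Notation (w, z, Tcp, Tcm, dcm, dcp) is assumed
# for illustration and does not reproduce the paper's exact formulas.
def bus_load_fractions(w, z, Tcp, Tcm, dcm, dcp):
    m = len(w)
    # Each fraction is an affine function of the first one:
    # alpha[i] = a[i]*alpha[0] + b[i], from the recursion
    # alpha[i+1] = (alpha[i]*w[i]*Tcp + dcp[i] - dcp[i+1] - dcm[i+1]) / (z*Tcm + w[i+1]*Tcp)
    a, b = [1.0], [0.0]
    for i in range(m - 1):
        denom = z * Tcm + w[i + 1] * Tcp
        a.append(a[i] * w[i] * Tcp / denom)
        b.append((b[i] * w[i] * Tcp + dcp[i] - dcp[i + 1] - dcm[i + 1]) / denom)
    alpha0 = (1.0 - sum(b)) / sum(a)              # fractions must sum to 1
    alpha = [ai * alpha0 + bi for ai, bi in zip(a, b)]
    if min(alpha) < 0:
        raise ValueError("no feasible equal-finish-time solution for this sequence")
    finish = dcm[0] + alpha[0] * z * Tcm + dcp[0] + alpha[0] * w[0] * Tcp
    return alpha, finish
```

A negative fraction signals that the chosen distribution sequence admits no equal-finish-time solution, which is where conditions like the one derived in the paper come into play.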

2.
The problem of optimal divisible load distribution in distributed bus networks employing a heterogeneous cluster of processors is addressed. The objective is to minimize the total processing time of the entire load subject to the communication and computation delays. In the mathematical model we adopt, both the granularity of the load fractions and all the associated overheads (also referred to as start-up costs) in the process of communication and computation are considered explicitly in the problem formulation. We introduce a directed flow graph model for representing the load distribution process. This representation is novel to this literature. With this model, we first derive a closed-form solution for an optimal processing time. We propose an integer approximation algorithm and derive ultimate performance bounds for the class of homogeneous networks. We then extend the problem to a special class of application problems in which the data partitioning is restricted to a finite number of partitions. For this case, we present a recursive procedure to obtain the optimal processing time. We then present two different integer approximation algorithms, PIA and IIA, that can generate integer load fractions and yield suboptimal solutions. The choice between these algorithms is also analyzed. All the results are extended to a class of homogeneous networks to obtain ultimate performance bounds. Several illustrative examples are provided for ease of explanation.
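As a rough illustration of how real-valued fractions can be turned into integer load units, consider the generic largest-remainder rounding below; this is only a hedged sketch and is not the paper's PIA or IIA algorithm.

```python
import math

def integerize_fractions(alpha, total_units):
    """Round real-valued load fractions to integer units that sum to total_units."""
    real = [a * total_units for a in alpha]
    units = [math.floor(r) for r in real]
    leftover = total_units - sum(units)
    # hand the leftover units, one each, to the largest fractional remainders
    order = sorted(range(len(alpha)), key=lambda i: real[i] - units[i], reverse=True)
    for i in order[:leftover]:
        units[i] += 1
    return units
```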

3.
The mapping and allocation of neurons is an important research topic in the virtual implementation of artificial neural networks. This paper systematically analyzes a key property of artificial neural networks, parallel distributed processing, and discusses in depth two central concepts of the mapping problem: load balancing and communication overhead. On this basis, a series of mapping algorithms is proposed and their performance is analyzed. Among them, the absorption algorithm exploits the inherent parallelism of artificial neural networks to the greatest extent and is a real-time algorithm.

4.
High-performance computing requires a high-quality distribution of the processes of a parallel application over the processors of a parallel computer at runtime, such that both the maximum load and the dilation are minimized. The performance of a simple randomized tree embedding algorithm that dynamically supports tree-structured parallel computations on arbitrary static networks is analyzed in this paper. The algorithm spreads newly created tree nodes to neighboring processors, which actually provides randomized dilation-1 tree embedding in static networks. We develop a linear system of equations that characterizes expected loads on all processors under the reproduction tree model, which can generate trees of arbitrary size and shape. It is shown that as the tree size becomes large, the asymptotic performance ratio of the randomized tree embedding algorithm is the ratio of the maximum processor degree to the average processor degree. This implies that the simple randomized tree embedding algorithm is able to generate high-quality load distributions on virtually all static networks commonly employed in parallel and distributed computing.
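A small simulation conveys the flavour of the algorithm. The sketch below is our simplification (random attachment rather than the paper's reproduction tree model): each new tree node is placed on a uniformly random neighbour of its parent's processor, and the empirical ratio of maximum to average load is reported.

```python
import random

def randomized_tree_embedding(adj, n_tree_nodes, start=0):
    """adj: dict mapping each processor to a list of neighbouring processors."""
    load = {p: 0 for p in adj}
    hosts = [start]                      # processor hosting each tree node created so far
    load[start] = 1
    for _ in range(n_tree_nodes - 1):
        parent_host = random.choice(hosts)             # pick an existing tree node as parent
        child_host = random.choice(adj[parent_host])   # dilation-1: child goes to a neighbour
        load[child_host] += 1
        hosts.append(child_host)
    average = n_tree_nodes / len(adj)
    return max(load.values()) / average                # empirical performance ratio
```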

5.
The problem of distributing and processing a divisible load in a heterogeneous linear network of processors with arbitrary processor release times is considered. A divisible load is very large in size and has computationally intensive CPU requirements. Further, it has the property that the load can be partitioned arbitrarily into any number of portions and can be scheduled onto processors independently for computation. The load is assumed to arrive at one of the farthest end processors, referred to as boundary processors, for processing. The processors in the network are assumed to have nonzero release times, i.e., the time instants from which the processors are available for processing the divisible load. Our objective is to design a load distribution strategy, taking into account the release times of the processors, such that the entire processing time of the load is a minimum. We consider two generic cases, in which all processors have identical release times and in which all processors have arbitrary release times. We adopt both the single- and multi-installment strategies proposed in the divisible load scheduling literature in our design of load distribution strategies, wherever necessary, to achieve a minimum processing time. Finally, when optimal strategies cannot be realized, we propose two heuristic strategies, one for the identical release times case and the other for the nonidentical release times case. Several conditions are derived to determine whether or not an optimal load distribution exists, and illustrative examples are provided for ease of understanding.

6.
A distributed task scheduling algorithm on linear networks
For an existing theoretical model of distributed computing, in which unit-length tasks are generated independently by the processors, there is no global control, and communication between processors takes time, the problem of efficiently scheduling tasks on a linear network is studied. By balancing task processing time against communication time within the algorithm, a distributed algorithm with an approximation ratio of 5.88 is obtained; the algorithm requires no global information and uses a simple processing strategy. The lower bound on the approximation ratio for this problem is also studied, and it is proved that no algorithm with an approximation ratio smaller than 1.16 exists.

7.
The problem of obtaining optimal processing time in a distributed computing system consisting of (N+1) processors and N communication links, arranged in a single-level tree architecture, is considered. It is shown that optimality can be achieved through a hierarchy of steps involving optimal load distribution, load sequencing, and processor-link arrangement. Closed-form expressions for the optimal processing time are derived for a general case of networks with different processor speeds and different communication link speeds. Using these closed-form expressions, the paper analytically proves a number of significant results that in earlier studies were only conjectured from computational results. In addition, it also extends these results to a more general framework. The above analysis is carried out for the cases in which the root processor may or may not be equipped with a front-end processor. Illustrative examples are given for all cases considered.
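For reference, the classical equal-finish-time recursion for a single-level tree with a front-end-equipped root can be written in a few lines of Python. This is the standard no-overhead form under assumed notation (w for processor speed parameters, z for link speed parameters), not the paper's full set of closed-form expressions.

```python
def single_level_tree_fractions(w, z, Tcp, Tcm):
    """w[0] is the root; w[i], z[i] (i >= 1) are the child processor and link parameters."""
    # All processors finish together:
    # alpha[i]*w[i]*Tcp = alpha[i+1]*(z[i+1]*Tcm + w[i+1]*Tcp)
    ratios = [1.0]
    for i in range(len(w) - 1):
        ratios.append(ratios[i] * w[i] * Tcp / (z[i + 1] * Tcm + w[i + 1] * Tcp))
    total = sum(ratios)
    alpha = [r / total for r in ratios]
    makespan = alpha[0] * w[0] * Tcp    # root with a front end computes for the whole makespan
    return alpha, makespan
```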

8.
In this paper, we consider the problem of scheduling multiple divisible loads on heterogeneous linear daisy chain networks. Our objective is to design a load distribution strategy such that the total processing time of a set of loads is minimized. We assume that the set of loads is resident in one of the farthest end processors, which has a scheduler that distributes the load to the other processors in the network. When distributing a load from the set, the distribution pattern of the previous load has to be taken into consideration to ensure that no processors are left idle and there are no collisions on the communication links. We design single- and multi-installment strategies to achieve the above objective. We derive certain important conditions to determine whether an optimum solution exists. We propose two heuristic strategies for when an optimum solution is unattainable. Based on the two heuristic strategies, we conduct four different simulation experiments to track the performance of the strategies under several real-life situations and to identify the best combination suitable for our multiple-loads distribution strategy. We also run simulations for a homogeneous system to quantify the performance under three different policies, that is, when the loads are (a) unsorted, (b) sorted with smallest load first (SLF), and (c) sorted with largest load first (LLF). A detailed analysis of the simulation results is presented and, based on it, recommendations are made for the choice of strategies. Finally, we compare the performance of a single-load distribution strategy against the multiple-loads distribution strategy designed in this paper to quantify the exact performance gain that can be achieved. Illustrative examples are also provided for ease of understanding.

9.
PASM is a proposed large-scale distributed/parallel processing system which can be partitioned into independent SIMD/MIMD machines of various sizes. One design problem for systems such as PASM is task scheduling. The use of multiple FIFO queues for nonpreemptive task scheduling is described. Four multiple-queue scheduling algorithms with different placement policies are presented and applied to the PASM parallel processing system. Simulation of a queueing network model is used to compare the performance of the algorithms. Their performance is also considered in the case where there are faulty control units and processors. The multiple-queue scheduling algorithms can be adapted for inclusion in other multiple-SIMD and partitionable SIMD/MIMD systems that use similar types of interconnection networks to those being considered for PASM.

10.
In this paper, we propose a new load distribution strategy called ‘send-and-receive’ for scheduling divisible loads in a linear network of processors with communication delay. This strategy is designed to optimally utilize the network resources and thereby minimize the processing time of the entire processing load. Closed-form expressions for the optimal sizes of the load fractions and the processing time are derived for the cases in which the processing load originates at a processor located at the boundary and in the interior of the network. A condition on processor and link speeds is also derived to ensure that the processors are continuously engaged in the load distribution. This paper also presents a parallel implementation of the ‘digital watermarking’ problem on a personal computer-based Pentium Linear Network (PLN) topology. Experiments are carried out to study the performance of the proposed strategy, and the results are compared with other strategies found in the literature.

11.
A load processor is a system that has a buffer which can receive load and store it while it is waiting to be processed, and has a local decision-making policy for determining if portions of its load should be sent to other load processors. A load balancing system is a set of such load processors that are connected in a network so that (i) they can sense the amount of load in the buffers of neighbouring processors and pass load to them, and (ii) via local information and decisions by the individual load processors, the overall load in the entire network can be balanced. Such balancing is important to ensure that certain processors are not overloaded while others are left idle (i.e. load balancing helps avoid underutilization of processing resources). The topology of the network, delays in transporting and sensing load, types of load, and types of local load passing policies all affect the performance of the load balancing system. In this paper, we show how a variety of load balancing systems can be modelled in a discrete event system (DES) theoretic framework, and how balancing properties and performance can be characterized and analysed in a general Lyapunov stability theoretic framework.
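A toy version of such a local load-passing policy (our construction, ignoring the transport and sensing delays that the paper models explicitly) might look like the following sketch.

```python
def local_balance_step(load, adj):
    """One synchronous step: each processor passes half of its excess to its least-loaded neighbour."""
    delta = {p: 0.0 for p in load}
    for p, neighbours in adj.items():
        q = min(neighbours, key=lambda n: load[n])   # sense neighbouring buffers
        excess = load[p] - load[q]
        if excess > 0:
            delta[p] -= excess / 2.0
            delta[q] += excess / 2.0
    return {p: load[p] + delta[p] for p in load}
```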

12.
In this paper, we address several issues that are imperative to grid environments, such as handling resource heterogeneity and sharing, communication latency, job migration from one site to another, and load balancing. We address these issues by proposing two job migration algorithms, MELISA (modified ELISA) and LBA (load balancing on arrival). The algorithms differ in the way load balancing is carried out and are shown to be efficient in minimizing the response time in large- and small-scale heterogeneous grid environments, respectively. MELISA, which is applicable to large-scale systems (that is, interGrid), is a modified version of ELISA in which we consider the job migration cost, resource heterogeneity, and network heterogeneity when load balancing is considered. The LBA algorithm, which is applicable to small-scale systems (that is, intraGrid), performs load balancing by estimating the expected finish time of a job on buddy processors on each job arrival. Both algorithms estimate system parameters such as the job arrival rate, CPU processing rate, and load on the processor, and balance the load by migrating jobs to buddy processors, taking into account the job transfer cost, resource heterogeneity, and network heterogeneity. We quantify the performance of our algorithms using several influencing parameters such as the job size, data transfer rate, status exchange period, and migration limit, and we discuss the implications of the performance and choice of our approaches.
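The arrival-time decision can be sketched roughly as follows; this is an illustrative approximation in the spirit of LBA, and the names and cost estimates are ours, not the paper's exact formulas.

```python
def choose_target(job_work, job_data, local, buddies):
    """Pick the processor minimizing the estimated finish time, including transfer cost.

    local / buddies entries are dicts: {'queue': pending work, 'speed': work per second,
    'bw': bytes per second of the link to that processor}.
    """
    def finish_time(p, transfer_time):
        return transfer_time + (p['queue'] + job_work) / p['speed']

    best_name, best_time = 'local', finish_time(local, 0.0)
    for name, p in buddies.items():
        t = finish_time(p, job_data / p['bw'])       # migration cost over a heterogeneous link
        if t < best_time:
            best_name, best_time = name, t
    return best_name
```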

13.
闫超  王光旭  刘明 《计算机工程》2011,37(3):87-89,92
This paper analyzes link-layer broadcast and multicast applications in the TCP/IP protocol suite and the implementation of shared-memory-based virtual network devices in multiprocessor environments. It compares the physical-layer and data-link-layer differences between implementing broadcast and multicast over Ethernet and over a shared-memory virtual network. It then analyzes the broadcast and multicast implementation strategy of the shared-memory virtual network device driver in the VxWorks operating system and, based on the structural characteristics of a parallel signal-processing board with multiple Loongson 2E processors, proposes a pipeline-based optimization strategy. The optimized network offers better real-time performance, better load balance across nodes, and higher transmission efficiency.

14.
Writing parallel programs that can take advantage of non-dedicated processors is much more difficult than writing such programs for networks of dedicated processors. In a non-dedicated environment such programs must use autonomic techniques to respond to the unpredictable load fluctuations that prevail in the computational environment. In adaptive query processing (AQP), several techniques have been proposed for dynamically redistributing processor load assignments throughout a computation to take account of varying resource capabilities, but we know of no previous study that compares their performance. This paper presents a simulation-based evaluation of these autonomic parallelization techniques in a uniform environment and compares how well they improve the performance of the computation. Four published strategies are compared with a new algorithm that seeks to overcome some weaknesses identified in the existing approaches. In addition, we explore the use of techniques from online algorithms to provide a firm foundation for determining when to adapt in two of the existing algorithms. The evaluations identify situations in which each strategy may be used effectively and in which it should be avoided.
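One well-known way to ground the "when to adapt" decision in online algorithms is a ski-rental-style rule: defer repartitioning until the imbalance cost accrued so far equals the estimated repartitioning cost, which keeps the total cost within a factor of two of the best clairvoyant choice. The sketch below is our illustration of that idea, not one of the surveyed strategies.

```python
class AdaptTrigger:
    """Ski-rental-style trigger: adapt once wasted time matches the repartitioning cost."""

    def __init__(self, repartition_cost):
        self.repartition_cost = repartition_cost
        self.wasted = 0.0

    def observe(self, actual_step_time, balanced_step_time):
        # accumulate the time lost to imbalance relative to an estimated balanced plan
        self.wasted += max(0.0, actual_step_time - balanced_step_time)
        return self.wasted >= self.repartition_cost   # True means: repartition now
```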

15.
This paper describes a software architecture designed as a support for tackling the load distribution problem when solving complex problems on concurrent processors. We have considered transputer-based MIMD multiprocessors as concurrent processors and a simulator for biologically inspired neural networks as a case study. Biologically inspired neural networks are characterized by having many thousands of neurons and synapses and topologically based connection schemes. It has been our main aim to give the user the possibility of simply defining and modifying widely differing load distribution strategies, in order to make it possible to deal with a broad range of neural network architectures and processor topologies. Furthermore we provide a real tool for hiding communication delays.

16.
The computer industry has evolved very rapidly from single-user computers to computer networks where users are able to share both local and remote files. Networks of microcomputers facilitate the integration of all information processing for distributed applications such as database processing and electronic mail. One management application of promising potential for computer networks is distributed simulation. Simulation analysis can be useful to essentially all problem-solving and decision-making on the job.

To implement a particular distributed application, computer communication between processors must be considered. Unlike expensive multiprocessor computers, networks of less-expensive microcomputers do not have pre-established communication paths between processors. This paper addresses how this obstacle may be overcome by using communication protocols based on the Open Systems Interconnection (OSI) reference model. Protocol services needed to support a distributed simulation environment will be identified, and their implementation through a prototype will then be investigated and evaluated.


17.
Many scientific applications involve grids that lack a uniform underlying structure. These applications are often also dynamic in nature in that the grid structure significantly changes between successive phases of execution. In parallel computing environments, mesh adaptation of unstructured grids through selective refinement/coarsening has proven to be an effective approach. However, achieving load balance while minimizing interprocessor communication and redistribution costs is a difficult problem. Traditional dynamic load balancers are mostly inadequate because they lack a global view of system loads across processors. In this paper, we propose a novel and general-purpose load balancer that utilizes symmetric broadcast networks (SBN) as the underlying communication topology and compare its performance with a successful global load balancing environment, called PLUM, specifically created to handle adaptive unstructured applications. Our experimental results on an IBM SP2 demonstrate that the SBN-based load balancer achieves lower redistribution costs than those under PLUM by overlapping processing and data migration.

18.
Overload control of call processors in telecom networks is used to protect the network of call processing computers from excessive load during traffic peaks, and involves techniques of predictive control with limited local information. Here we propose a neural network algorithm, in which a group of neural controllers are trained using examples generated by a globally optimal control method. Simulations show that the neural controllers have better performance than local control algorithms in both the throughput and the response to traffic upsurges. Compared with the centralized control algorithm, the neural control significantly decreases the computational time for making decisions and can be implemented in real time.

19.
The underlying assumption of Divisible Load Scheduling (DLS) theory is that the processors composing the network are obedient, i.e., they do not “cheat” the scheduling algorithm. This assumption is unrealistic if the processors are owned by autonomous, self-interested organizations that have no a priori motivation for cooperation and will manipulate the algorithm if it is beneficial to do so. In this paper, we address this issue by designing a distributed mechanism for scheduling divisible loads in tree networks, called DLS-T, which provides incentives to processors for reporting their true processing capacity and executing their assigned load at full processing capacity. We prove that the DLS-T mechanism computes the optimal allocation in an ex post Nash equilibrium. Finally, we simulate and study the mechanism under various network structures and processor parameters.

20.
A method for accelerating diffusion-based load balancing algorithms in heterogeneous systems
金之雁  王鼎兴 《软件学报》2003,14(5):904-910
Nowadays, many institutions and organizations have local area networks connecting hundreds of workstations and microcomputers and use them as a cluster system. On such heterogeneous systems, dynamic load balancing is an important way to improve performance. The diffusion method is a dynamic load balancing algorithm for homogeneous systems. This paper extends the diffusion algorithm to heterogeneous systems, studies the relationship between the placement of processors of different speeds and the convergence rate of the diffusion process, and proposes an optimization method for accelerating the convergence of the diffusion algorithm. Preliminary experiments show that the method can speed up the diffusion algorithm by arranging the processors appropriately.
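For illustration, one diffusion step generalized to processors of different speeds might be written as below. This is a minimal sketch under our own formulation, balancing speed-normalized loads with one simple choice of edge weight; the paper's contribution, accelerating convergence through processor placement, is not reproduced here.

```python
def heterogeneous_diffusion_step(load, speed, adj, c=0.25):
    """One diffusion step: move load along each edge toward equal load/speed ratios.

    Assumes integer (comparable) node ids and a symmetric adjacency dict.
    """
    new_load = dict(load)
    for p in adj:
        for q in adj[p]:
            if p < q:                                  # visit each undirected edge once
                diff = load[p] / speed[p] - load[q] / speed[q]
                flow = c * diff * min(speed[p], speed[q])
                new_load[p] -= flow
                new_load[q] += flow
    return new_load
```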
