首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
本文对具有高通讯延迟的多处理机系统(机群系统)上的任务调度算法进行了研究,与以往算法主要考虑任务图的关键路径不同,本文给出了任务图的调度与其偶图匹配的对应关系,并由此提出了一种新的启发式算法,通过模拟试验显示本算法具有较好的调度效果。  相似文献   

2.
The simplicity of regular mesh topology Network on Chip (NoC) architecture leads to reductions in design time and manufacturing cost. A weakness of the regular shaped architecture is its inability to efficiently support cores of different sizes. A proposed way in literature to deal with this is to utilize the region concept, which helps to accommodate cores larger than the tile size in mesh topology NoC architectures. Region concept offers many new opportunities for NoC design, as well as provides new design issues and challenges. One of the most important among these is the design of an efficient deadlock free routing algorithm. Available adaptive routing algorithms developed for regular mesh topology cannot ensure freedom from deadlocks. In this paper, we list and discuss many new design issues which need to be handled for designing NoC systems incorporating cores larger than the tile size. We also present and compare two deadlock free routing algorithms for mesh topology NoC with regions. The idea of the first algorithm is borrowed from the area of fault tolerant networks, where a network topology is rendered irregular due to faults in routers or links, and is adapted for the new context. We compare this with an algorithm designed using a methodology for design of application specific routing algorithms for communication networks. The application specific routing algorithm tries to maximize adaptivity by using static and dynamic communication requirements of the application. Our study shows that the application specific routing algorithm not only provides much higher adaptivity, but also superior performance as compared to the other algorithm in all traffic cases. But this higher performance for the second algorithm comes at a higher area cost for implementing network routers.  相似文献   

3.
We develop an optimal task allocation and scheduling algorithm which minimizes the computing period for multiprocessor systems with general network structures considering task execution time and communication contentions and routing delays explicitly. We presented new ideas of scheduling: (i) individual start allowing overlapping two different iterations, (ii) the scheduling space and the scheduling graph representing feasible schedules, and (iii) the check-and-diffusion algorithm utilizing property of the start-time difference vs. the computing period. With concrete examples of scheduling spaces, segments, and schedules for various multiprocessor network architectures, we showed that individual start reduces the computing period, and our algorithm can find the optimal computing period without exhaustive search.  相似文献   

4.
The multiprocessor scheduling problem is the problem of scheduling the tasks of a precedence constrained task graph (representing a parallel program) onto the processors of a multiprocessor in a way that minimizes the completion time. Since this problem is known to be NP-hard in the strong sense in all but a few very restricted eases, heuristic algorithms are being developed which obtain near optimal schedules in a reasonable amount of computation time. We present an efficient heuristic algorithm for scheduling precedence constrained task graphs with nonnegligible intertask communication onto multiprocessors taking contention in the communication channels into consideration. Our algorithm for obtaining satisfactory suboptimal schedules is based on the classical list scheduling strategy. It simultaneously exploits the schedule-holes generated in the processors and in the communication channels during the scheduling process in order to produce better schedules. We demonstrate the effectiveness of our algorithm by comparing with two competing heuristic algorithms available in the literature  相似文献   

5.
A genetic algorithm for multiprocessor scheduling   总被引:6,自引:0,他引:6  
The problem of multiprocessor scheduling can be stated as finding a schedule for a general task graph to be executed on a multiprocessor system so that the schedule length can be minimized. This scheduling problem is known to be NP-hard, and methods based on heuristic search have been proposed to obtain optimal and suboptimal solutions. Genetic algorithms have recently received much attention as a class of robust stochastic search algorithms for various optimization problems. In this paper, an efficient method based on genetic algorithms is developed to solve the multiprocessor scheduling problem. The representation of the search node is based on the order of the tasks being executed in each individual processor. The genetic operator proposed is based on the precedence relations between the tasks in the task graph. Simulation results comparing the proposed genetic algorithm, the list scheduling algorithm, and the optimal schedule using random task graphs, and a robot inverse dynamics computational task graph are presented  相似文献   

6.
We address routing in Networks-On-Chip (NoC) architectures that use irregular mesh topologies with Long-Range Links (LRL). These topologies create difficult conditions for routing algorithms, as standard algorithms assume a static, regular link structure and exploit the uniformity of regular meshes to avoid deadlock and maintain routability. We present a novel routing algorithm that can cope with these irregular topologies and adapt to run-time LRL insertion and topology reconfiguration. Our approach to accommodate dynamic topology reconfiguration is to use a new technique that decomposes routing relations into two stages: the calculation of output ports on the current minimal path and the application of routing restrictions designed to prevent deadlock. In addition, we present a selection function that uses local topology data to adaptively select optimal paths.The routing algorithm is shown to be deadlock-free, after which an analysis of all possible routing decisions in the region of an LRL is carried out. We show that the routing algorithm minimises the cost of sub-optimally placed LRL and display the hop savings available. When applied to LRLs of less than seven hops, the overall traffic hop count and associated routing energy cost is reduced. In a simulated 8 × 8 network the total input buffer usage across the network was reduced by 6.5%.  相似文献   

7.
Hierarchical scheduling has been proposed as a scheduling technique to achieve aggregate resource partitioning among related groups of threads and applications in uniprocessor and packet scheduling environments. Existing hierarchical schedulers are not easily extensible to multiprocessor environments because 1) they do not incorporate the inherent parallelism of a multiprocessor system while resource partitioning and 2) they can result in unbounded unfairness or starvation if applied to a multiprocessor system in a naive manner. In this paper, we present hierarchical multiprocessor scheduling (H-SMP), a novel hierarchical CPU scheduling algorithm designed for a symmetric multiprocessor (SMP) platform. The novelty of this algorithm lies in its combination of space and time multiplexing to achieve the desired bandwidth partition among the nodes of the hierarchical scheduling tree. This algorithm is also characterized by its ability to incorporate existing proportional-share algorithms as auxiliary schedulers to achieve efficient hierarchical CPU partitioning. In addition, we present a generalized weight feasibility constraint that specifies the limit on the achievable CPU bandwidth partitioning in a multiprocessor hierarchical framework and propose a hierarchical weight readjustment algorithm designed to transparently satisfy this feasibility constraint. We evaluate the properties of H-SMP using hierarchical surplus fair scheduling (H-SFS), an instantiation of H-SMP that employs surplus fair scheduling (SFS) as an auxiliary algorithm. This evaluation is carried out through a simulation study that shows that H-SFS provides better fairness properties in multiprocessor environments as compared to existing algorithms and their naive extensions.  相似文献   

8.
Both parallel and distributed network environment systems play a vital role in the improvement of high performance computing. Of primary concern when analyzing these systems is multiprocessor task scheduling. Therefore, this paper addresses the challenge of multiprocessor task scheduling parallel programs, represented as directed acyclic task graph (DAG), for execution on multiprocessors with communication costs. Moreover, we investigate an alternative paradigm, where genetic algorithms (GAs) have recently received much attention, which is a class of robust stochastic search algorithms for various combinatorial optimization problems. We design the new encoding mechanism with a multi-functional chromosome that uses the priority representation—the so-called priority-based multi-chromosome (PMC). PMC can efficiently represent a task schedule and assign tasks to processors. The proposed priority-based GA has show effective performance in various parallel environments for scheduling methods.  相似文献   

9.
对角网格中的无死锁自适应路由算法   总被引:2,自引:0,他引:2  
网格是多计算机中应用广泛的互连结构,提出了一种新的互连结构-对角网格。并在这种结构上提出了一类自适应无死锁的路由算法-负优先算法,证明了此算法的无死锁性。对角网格是可平面图,其结构简单,可扩充性非常好。负优先自适应路由算法的突出优点是对硬件逻辑要求简单,无须增加虚拟通道即可达 死锁和自适应。  相似文献   

10.
多处理器调度算法实现及其Petri网建模与仿真   总被引:1,自引:0,他引:1  
多处理器调度算法在嵌入式实时系统领域中起着关键的作用。根据多处理器的特点,提出一种实时多处理器动态分割并行调度算法SPara。该算法解决了此前多处理器算法,如Myopic、EDPF等仅依据截止期对任务调度产生的问题,实现了增加任务紧迫度限制的调度策略,以及针对执行时间长、截止期紧迫任务的有效调度方法。同时算法结合高级颜色时间Petri网理论进行建模并仿真。测试结果表明,SPara算法在处理器利用率以及调度成功率方面较Myopic等算法有较大提高。  相似文献   

11.
分布式系统技术为采用低成本购建高性能系统提供了有效的途径,但是由于资源的分配与需求可能产生冲突,造成系统中发生死锁,导致系统运行陷入停滞.在不可靠的分布式系统中,故障会干扰正常的死锁检测,但现有的死锁检测算法不具有容错功能.对失效形式进行了归类,提出一个容错的死锁检测解除算法.算法建立在通用的AND-OR模型基础上,采用扩散计算和集中规约方式,不仅能够检测到死锁,而且能给出死锁环的全部成员.若死锁拓扑处于静态且为环状,算法的消息复杂度的上限为e n-1,时间复杂度为d,其中e为死锁等待图中边的个数,n和d为构成死锁环的节点的个数,分析表明算法性能等于或优于同类算法.  相似文献   

12.
We present an approach to designing cellular automata-based multiprocessor scheduling algorithms in which extracting knowledge about the scheduling process occurs. We consider the simplest case when a multiprocessor system is limited to two-processors. To design cellular automata corresponding to a given program graph, we propose a generic definition of program graph neighborhood, transparent to the various kinds, sizes, and shapes of program graphs. The cellular automata-based scheduler works in two modes: learning mode and operation mode. Discovered rules are typically suitable for sequential cellular automata working as a scheduler, while the most interesting and promising feature of cellular automata are their massive parallelism. To overcome difficulties in evolving parallel cellular automata rules, we propose using coevolutionary genetic algorithm. Discovered this way, rules enable us to design effective parallel schedulers. We present a number of experimental results for both sequential and parallel scheduling algorithms discovered in the context of a cellular automata-based scheduling system  相似文献   

13.
In this paper, a cellular automaton (CA) is proposed as a tool for designing distributed scheduling algorithms for allocating parallel program tasks in multiprocessor systems. For this purpose, a program graph is considered as a CA containing elementary automata interacting locally according to some rules. In the first phase of the algorithm, effective rules for the CA are discovered by a genetic algorithm. In the second phase, the CA works as a distributed scheduler. In this phase, for any initial allocation of tasks in a multiprocessor system, the CA-based scheduler finds an allocation minimizing the total execution time of the program in a given system topology. The effectiveness of the proposed scheduling algorithm is shown for a number of program graphs scheduled in a two-processor system.  相似文献   

14.
All-optical communication, in particular, wavelength-division-multiplexing (WDM) technique, has been proposed as a promising candidate to meet the ever-increasing demands on bandwidth from emerging bandwidth-intensive computing/networking applications. However, with current technology, the cost of optical communication, especially the cost of optical buffering and wavelength conversion, remains a major concern for such applications. In this paper, we study WDM optical interconnects that utilize low cost recirculating buffering and limited range wavelength conversion. We first consider the packet scheduling problem in this type of interconnect, and formalize the problem of maximizing throughput and minimizing packet delay as a matching problem in a bipartite graph. We give an optimal parallel algorithm for this problem that runs in O(Bk/sup 2/) time, compared to O((N+B)/sup 3/k/sup 3/) time if directly applied to existing matching algorithms for general bipartite graphs, where N is the number of input/output fibers of the interconnect, B is the number of fiber delay lines, and k is the number of wavelengths. We also consider efficient switching fabric designs for this type of interconnect. We distinguish between the switching fabric connecting the input fibers to the output fibers and the switching fabric connecting the input fibers to the delay lines and show that by adopting the idea of concentration, the cost of the latter can be reduced significantly in terms of the number of crosspoints.  相似文献   

15.
We consider a graph theoretical model and study a parallel implementation of the well-known Gaussian elimination method on parallel distributed memory architectures, where the communication delay for the transmission of an elementary data is higher than the computation time of an elementary instruction. We propose and analyze two low-complexity algorithms for scheduling the tasks of the parallel Gaussian elimination on an unbounded number of completely connected processors. We compare these two algorithms with a higher-complexity general-purpose scheduling algorithm, the DSC heuristic, proposed by A. Gerasoulis and T. Yang (1993)  相似文献   

16.
We consider the problem of scheduling tasks on multiprocessor architectures in the presence of communication delays. Given a set of dependent tasks, the scheduling problem is to allocate the tasks to processors such that the pre-specified precedence constraints among the tasks are obeyed and certain cost-measures (such as the computation time) are minimized. Several cases of the scheduling problem have been proven to be NP-complete. Nevertheless, there are polynomial time algorithms for interesting special cases of the general scheduling problem. Most of these results, however, do not take into consideration the delays due to message passing among processors. In this paper we study the increase in time complexity of scheduling problems due to the introduction of communication delays. In particular, we address the open problem of scheduling Out-forests (In-forests) in a multiprocessor system of m identical processors when communication delays are considered. The corresponding problem of scheduling Out-forests (In-forests) without communication delays admits an elegant polynomial time solution as presented first by Hu in 1961; however, the problem in the presence of communication delays has remained unsolved. We present here first known polynomial time algorithms for the computation of the optimal schedule when the number of available processors is given and bounded and both computation and communication delays are assumed to take one unit of time. Furthermore, we present a linear-time algorithm for computing a near-optimal schedule for unit-delay out-forests. The schedule's length exceeds the optimum by no more than (m-2) time units, where m is the number of processors. Hence for two processors the computed schedule is strictly optimum  相似文献   

17.
随着多核处理器体系结构在计算机领域的广泛应用,如何合理地对计算任务进行调度成为人们广泛讨论的问题。目前已经有针对多处理器的任务调度算法,但是这些算法在执行时要经过多次迭代,执行效率比较低。提出一种改进的波前调度算法MEWFM,它是一种执行时间短,加速比接近处理器核数的一种算法。这种算法主要包括任务图分层,层内调度和误差下降调度三个子算法。详细分析了这些算法的特点和执行流程。实验评测表明,算法在多处理器环境下的任务调度方面具有执行速度快,性能高等优势。  相似文献   

18.
同构计算环境中一种快速有效的静态任务调度算法   总被引:10,自引:1,他引:9  
快速有效的调度任务是多处理器计算环境中的一个关键问题.目前任务调度算法中刻画任务依赖关系最流行的模型是DAG,在以前的文献中,提出了一种新的更实际、更普遍的TTIG模型及其相应的MATE算法(基于同构计算环境).延伸了TTIG模型,并提出基于同构系统的新的算法及两种启发式方法(GBHA1和GBHA2).GBHA以组的形式尽量消除图中回路,因而能获得任务图的全局信息,具有更好的调度性能.在模拟实验中,将此算法与MATE和其他同构环境中基于DAG的有效调度算法,在不同测试条件下进行了比较,结果显示GBHA在性能上明显优于MATE,与基于DAG模型的调度算法比较而言,在性能方面各有千秋,但在算法时间复杂度方面具有显著的优势.  相似文献   

19.
This paper focuses on the efficient parallel implementation of systems of numerically intensive nature over loosely coupled multiprocessor architectures. These analytical models are of significant importance to many real-time systems that have to meet severe time constants. A parallel computing engine (PCE) has been developed in this work for the efficient simplification and the near optimal scheduling of numerical models over the different cooperating processors of the parallel computer. First, the analytical system is efficiently coded in its general form. The model is then simplified by using any available information (e.g., constant parameters). A task graph representing the interconnections among the different components (or equations) is generated. The graph can then be compressed to control the computation/communication requirements. The task scheduler employs a graph-based iterative scheme, based on the simulated annealing algorithm, to map the vertices of the task graph onto a Multiple-Instruction-stream Multiple-Data-stream (MIMD) type of architecture. The algorithm uses a nonanalytical cost function that properly considers the computation capability of the processors, the network topology, the communication time, and congestion possibilities. Moreover, the proposed technique is simple, flexible, and computationally viable. The efficiency of the algorithm is demonstrated by two case studies with good results.  相似文献   

20.
Energy consumption is a key parameter when highly computational tasks should be performed in a multiprocessor system. In this case, in order to reduce total energy consumption, task scheduling and low-power methodology should be combined in an efficient way. This paper proposes an algorithm for off-line communication-aware task scheduling and voltage selection using Ant Colony Optimization. The proposed algorithm minimizes total energy consumption of an application executing on a homogeneous multiprocessor system. The artificial agents explore the search space based on stochastic decision-making using global heuristic information with total energy consumption and local heuristic information with interprocessor communication volume. In search space exploration, both voltage selection and the dependencies between tasks are considered. The pheromone trails are updated by normalizing the total energy consumption. The pheromone trails represent the global heuristic information in order to utilize all entire energy consumption information from previous evaluated solutions. Experimental results show that the proposed algorithm outperforms traditional communication-aware task scheduling and task scheduling using genetic algorithms in terms of total energy consumption.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号