1.

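周向东 林澜 陈国勋 施伯乐 《小型微型计算机系统》2003,24(4):643-647

This paper studies task scheduling algorithms for multiprocessor systems with high communication latency (cluster systems). Unlike earlier algorithms, which mainly consider the critical path of the task graph, it establishes a correspondence between scheduling a task graph and matching in its bipartite graph, and on that basis proposes a new heuristic algorithm. Simulation experiments show that the algorithm achieves good scheduling results.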

2.

Deadlock free routing algorithms for irregular mesh topology NoC systems with rectangular regions

Rickard Maurizio Shashi 《Journal of Systems Architecture》2008,54(3-4):427-440

The simplicity of regular mesh topology Network on Chip (NoC) architecture leads to reductions in design time and manufacturing cost. A weakness of the regular shaped architecture is its inability to efficiently support cores of different sizes. One approach proposed in the literature is to use the region concept, which helps to accommodate cores larger than the tile size in mesh topology NoC architectures. The region concept offers many new opportunities for NoC design, but it also introduces new design issues and challenges. One of the most important among these is the design of an efficient deadlock free routing algorithm. Available adaptive routing algorithms developed for regular mesh topology cannot ensure freedom from deadlocks. In this paper, we list and discuss many new design issues which need to be handled for designing NoC systems incorporating cores larger than the tile size. We also present and compare two deadlock free routing algorithms for mesh topology NoC with regions. The idea of the first algorithm is borrowed from the area of fault tolerant networks, where a network topology is rendered irregular due to faults in routers or links, and is adapted for the new context. We compare this with an algorithm designed using a methodology for design of application specific routing algorithms for communication networks. The application specific routing algorithm tries to maximize adaptivity by using static and dynamic communication requirements of the application. Our study shows that the application specific routing algorithm not only provides much higher adaptivity, but also superior performance as compared to the other algorithm in all traffic cases. However, this higher performance for the second algorithm comes at a higher area cost for implementing network routers.

3.

Optimal task scheduling algorithm for cyclic synchronous tasks in general multiprocessor networks

《Journal of Parallel and Distributed Computing》2005,65(3):261-274

We develop an optimal task allocation and scheduling algorithm which minimizes the computing period for multiprocessor systems with general network structures, considering task execution time, communication contention, and routing delays explicitly. We present new ideas of scheduling: (i) individual start, which allows two different iterations to overlap, (ii) the scheduling space and the scheduling graph representing feasible schedules, and (iii) the check-and-diffusion algorithm utilizing the property of the start-time difference vs. the computing period. With concrete examples of scheduling spaces, segments, and schedules for various multiprocessor network architectures, we show that individual start reduces the computing period, and that our algorithm can find the optimal computing period without exhaustive search.

4.

Scheduling precedence constrained task graphs with non-negligible intertask communication onto multiprocessors

Selvakumar S. Siva Ram Murthy C. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(3):328-336

The multiprocessor scheduling problem is the problem of scheduling the tasks of a precedence constrained task graph (representing a parallel program) onto the processors of a multiprocessor in a way that minimizes the completion time. Since this problem is known to be NP-hard in the strong sense in all but a few very restricted cases, heuristic algorithms are being developed which obtain near optimal schedules in a reasonable amount of computation time. We present an efficient heuristic algorithm for scheduling precedence constrained task graphs with nonnegligible intertask communication onto multiprocessors, taking contention in the communication channels into consideration. Our algorithm for obtaining satisfactory suboptimal schedules is based on the classical list scheduling strategy. It simultaneously exploits the schedule-holes generated in the processors and in the communication channels during the scheduling process in order to produce better schedules. We demonstrate the effectiveness of our algorithm by comparing it with two competing heuristic algorithms available in the literature.
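
The classical list scheduling strategy that this abstract builds on can be sketched as follows. This is only a minimal illustration, not the authors' algorithm: it ignores channel contention and does not fill schedule-holes, and the function name `list_schedule`, the bottom-level priority, and the example task graph are all assumptions made for the sketch.

```python
from collections import defaultdict


def list_schedule(tasks, edges, num_procs):
    """Greedy list scheduling of a task DAG with communication costs.

    tasks: {name: execution_time}
    edges: {(pred, succ): communication_cost}, paid only across processors
    Returns ({name: (processor, start, finish)}, makespan).
    Hypothetical helper for illustration; contention is not modeled.
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for (u, v), cost in edges.items():
        preds[v].append((u, cost))
        succs[u].append((v, cost))

    # Priority = bottom level: longest execution + communication path to an exit task.
    blevel = {}

    def bottom_level(t):
        if t not in blevel:
            blevel[t] = tasks[t] + max(
                (cost + bottom_level(s) for s, cost in succs[t]), default=0
            )
        return blevel[t]

    for t in tasks:
        bottom_level(t)

    proc_free = [0] * num_procs  # time at which each processor becomes idle
    schedule = {}
    remaining = set(tasks)
    while remaining:
        # Ready tasks: all predecessors already scheduled (sorted for determinism).
        ready = sorted(t for t in remaining
                       if all(p in schedule for p, _ in preds[t]))
        task = max(ready, key=lambda t: blevel[t])

        def earliest_start(proc):
            # Data from a predecessor on another processor arrives after its cost.
            data_ready = max(
                (schedule[p][2] + (0 if schedule[p][0] == proc else cost)
                 for p, cost in preds[task]),
                default=0,
            )
            return max(proc_free[proc], data_ready)

        proc = min(range(num_procs), key=earliest_start)
        start = earliest_start(proc)
        schedule[task] = (proc, start, start + tasks[task])
        proc_free[proc] = start + tasks[task]
        remaining.remove(task)

    return schedule, max(finish for _, _, finish in schedule.values())


# Example: a diamond-shaped task graph on two processors.
tasks = {"A": 2, "B": 3, "C": 3, "D": 2}
edges = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
schedule, makespan = list_schedule(tasks, edges, num_procs=2)
```

Appending each task at the end of a processor's timeline keeps the sketch short; the hole-exploiting variant described in the abstract would instead search idle gaps on both processors and channels.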

5.

A genetic algorithm for multiprocessor scheduling (Total citations: 6; self-citations: 0; citations by others: 6)

Hou E.S.H. Ansari N. Hong Ren 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(2):113-120

The problem of multiprocessor scheduling can be stated as finding a schedule for a general task graph to be executed on a multiprocessor system so that the schedule length can be minimized. This scheduling problem is known to be NP-hard, and methods based on heuristic search have been proposed to obtain optimal and suboptimal solutions. Genetic algorithms have recently received much attention as a class of robust stochastic search algorithms for various optimization problems. In this paper, an efficient method based on genetic algorithms is developed to solve the multiprocessor scheduling problem. The representation of the search node is based on the order of the tasks being executed in each individual processor. The genetic operator proposed is based on the precedence relations between the tasks in the task graph. Simulation results comparing the proposed genetic algorithm, the list scheduling algorithm, and the optimal schedule, using random task graphs and a robot inverse dynamics computational task graph, are presented.

6.

A deadlock-free routing algorithm for dynamically reconfigurable Networks-on-Chip

Chris Jackson Simon J. Hollis 《Microprocessors and Microsystems》2011,35(2):139-151

We address routing in Networks-On-Chip (NoC) architectures that use irregular mesh topologies with Long-Range Links (LRL). These topologies create difficult conditions for routing algorithms, as standard algorithms assume a static, regular link structure and exploit the uniformity of regular meshes to avoid deadlock and maintain routability. We present a novel routing algorithm that can cope with these irregular topologies and adapt to run-time LRL insertion and topology reconfiguration. Our approach to accommodate dynamic topology reconfiguration is to use a new technique that decomposes routing relations into two stages: the calculation of output ports on the current minimal path and the application of routing restrictions designed to prevent deadlock. In addition, we present a selection function that uses local topology data to adaptively select optimal paths. The routing algorithm is shown to be deadlock-free, after which an analysis of all possible routing decisions in the region of an LRL is carried out. We show that the routing algorithm minimises the cost of sub-optimally placed LRL and display the hop savings available. When applied to LRLs of less than seven hops, the overall traffic hop count and associated routing energy cost is reduced. In a simulated 8 × 8 network the total input buffer usage across the network was reduced by 6.5%.

7.

Hierarchical Scheduling for Symmetric Multiprocessors

Chandra A. Shenoy P. 《Parallel and Distributed Systems, IEEE Transactions on》2008,19(3):418-431

Hierarchical scheduling has been proposed as a scheduling technique to achieve aggregate resource partitioning among related groups of threads and applications in uniprocessor and packet scheduling environments. Existing hierarchical schedulers are not easily extensible to multiprocessor environments because 1) they do not incorporate the inherent parallelism of a multiprocessor system when partitioning resources and 2) they can result in unbounded unfairness or starvation if applied to a multiprocessor system in a naive manner. In this paper, we present hierarchical multiprocessor scheduling (H-SMP), a novel hierarchical CPU scheduling algorithm designed for a symmetric multiprocessor (SMP) platform. The novelty of this algorithm lies in its combination of space and time multiplexing to achieve the desired bandwidth partition among the nodes of the hierarchical scheduling tree. This algorithm is also characterized by its ability to incorporate existing proportional-share algorithms as auxiliary schedulers to achieve efficient hierarchical CPU partitioning. In addition, we present a generalized weight feasibility constraint that specifies the limit on the achievable CPU bandwidth partitioning in a multiprocessor hierarchical framework and propose a hierarchical weight readjustment algorithm designed to transparently satisfy this feasibility constraint. We evaluate the properties of H-SMP using hierarchical surplus fair scheduling (H-SFS), an instantiation of H-SMP that employs surplus fair scheduling (SFS) as an auxiliary algorithm. This evaluation is carried out through a simulation study that shows that H-SFS provides better fairness properties in multiprocessor environments as compared to existing algorithms and their naive extensions.

8.

A comparison of multiprocessor task scheduling algorithms with communication costs

Reakook Hwang Mitsuo Gen Hiroshi Katayama 《Computers & Operations Research》2008

Both parallel and distributed network environment systems play a vital role in the improvement of high performance computing. Of primary concern when analyzing these systems is multiprocessor task scheduling. Therefore, this paper addresses the challenge of multiprocessor task scheduling for parallel programs, represented as a directed acyclic task graph (DAG), for execution on multiprocessors with communication costs. Moreover, we investigate an alternative paradigm, genetic algorithms (GAs), which have recently received much attention as a class of robust stochastic search algorithms for various combinatorial optimization problems. We design a new encoding mechanism with a multi-functional chromosome that uses the priority representation, the so-called priority-based multi-chromosome (PMC). PMC can efficiently represent a task schedule and assign tasks to processors. The proposed priority-based GA has shown effective performance in various parallel environments for scheduling methods.

9.

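Deadlock-free adaptive routing algorithms in diagonal meshes (Total citations: 2; self-citations: 0; citations by others: 2)

郑无疾 陈莘萌 李克清 《计算机研究与发展》2000,37(6):721-725

The mesh is a widely used interconnection structure in multicomputers. This paper proposes a new interconnection structure, the diagonal mesh, and on this structure a class of adaptive deadlock-free routing algorithms, the negative-first algorithms, which are proved to be deadlock-free. The diagonal mesh is a planar graph with a simple structure and very good scalability. The outstanding advantage of negative-first adaptive routing is that it requires only simple hardware logic and achieves deadlock freedom and adaptivity without adding virtual channels.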

10.

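Implementation of a multiprocessor scheduling algorithm and its Petri net modeling and simulation (Total citations: 1; self-citations: 0; citations by others: 1)

王异奇 刘青昆 张健 《计算机应用》2011,31(4):938-941

Multiprocessor scheduling algorithms play a key role in embedded real-time systems. Based on the characteristics of multiprocessors, this paper proposes SPara, a real-time multiprocessor scheduling algorithm with dynamic splitting and parallel execution. The algorithm addresses problems that arise when earlier multiprocessor algorithms such as Myopic and EDPF schedule tasks by deadline alone: it implements a scheduling policy that adds a task-urgency constraint, together with an effective method for scheduling tasks with long execution times and tight deadlines. The algorithm is also modeled and simulated using high-level colored timed Petri net theory. Test results show that SPara improves considerably on Myopic and similar algorithms in processor utilization and scheduling success rate.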

11.

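A fault-tolerant deadlock detection and resolution algorithm for the generalized deadlock model in distributed systems

程欣 刘宏伟 董剑 杨孝宗 《计算机研究与发展》2007,44(5):798-805

Distributed system technology offers an effective way to build high-performance systems at low cost, but conflicts between resource allocation and demand can cause deadlocks that bring the system to a halt. In unreliable distributed systems, faults interfere with normal deadlock detection, yet existing deadlock detection algorithms are not fault-tolerant. This paper classifies the forms of failure and proposes a fault-tolerant deadlock detection and resolution algorithm. Built on the general AND-OR model, the algorithm uses diffusion computation and centralized reduction; it not only detects deadlocks but also reports all members of the deadlock cycle. If the deadlock topology is static and forms a cycle, the upper bound on the algorithm's message complexity is e n-1 and its time complexity is d, where e is the number of edges in the deadlock wait-for graph and n and d are the number of nodes forming the deadlock cycle. Analysis shows that the algorithm performs as well as or better than comparable algorithms.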

12.

Sequential and parallel cellular automata-based scheduling algorithms

Seredynski F. Zomaya A.Y. 《Parallel and Distributed Systems, IEEE Transactions on》2002,13(10):1009-1023

We present an approach to designing cellular automata-based multiprocessor scheduling algorithms in which knowledge about the scheduling process is extracted. We consider the simplest case, in which a multiprocessor system is limited to two processors. To design cellular automata corresponding to a given program graph, we propose a generic definition of program graph neighborhood, transparent to the various kinds, sizes, and shapes of program graphs. The cellular automata-based scheduler works in two modes: learning mode and operation mode. Discovered rules are typically suitable for sequential cellular automata working as a scheduler, while the most interesting and promising feature of cellular automata is their massive parallelism. To overcome difficulties in evolving parallel cellular automata rules, we propose using a coevolutionary genetic algorithm. Rules discovered this way enable us to design effective parallel schedulers. We present a number of experimental results for both sequential and parallel scheduling algorithms discovered in the context of a cellular automata-based scheduling system.

13.

Scheduling tasks of a parallel program in two-processor systems with use of cellular automata

F. Seredynski 《Future Generation Computer Systems》1998,14(5-6):351-364

In this paper, a cellular automaton (CA) is proposed as a tool for designing distributed scheduling algorithms for allocating parallel program tasks in multiprocessor systems. For this purpose, a program graph is considered as a CA containing elementary automata interacting locally according to some rules. In the first phase of the algorithm, effective rules for the CA are discovered by a genetic algorithm. In the second phase, the CA works as a distributed scheduler. In this phase, for any initial allocation of tasks in a multiprocessor system, the CA-based scheduler finds an allocation minimizing the total execution time of the program in a given system topology. The effectiveness of the proposed scheduling algorithm is shown for a number of program graphs scheduled in a two-processor system.

14.

WDM optical interconnects with recirculating buffering and limited range wavelength conversion

Zhang Z. Yang Y. 《Parallel and Distributed Systems, IEEE Transactions on》2006,17(5):466-480

All-optical communication, in particular the wavelength-division-multiplexing (WDM) technique, has been proposed as a promising candidate to meet the ever-increasing demands on bandwidth from emerging bandwidth-intensive computing/networking applications. However, with current technology, the cost of optical communication, especially the cost of optical buffering and wavelength conversion, remains a major concern for such applications. In this paper, we study WDM optical interconnects that utilize low cost recirculating buffering and limited range wavelength conversion. We first consider the packet scheduling problem in this type of interconnect, and formalize the problem of maximizing throughput and minimizing packet delay as a matching problem in a bipartite graph. We give an optimal parallel algorithm for this problem that runs in O(Bk^2) time, compared to O((N+B)^3 k^3) time if existing matching algorithms for general bipartite graphs are applied directly, where N is the number of input/output fibers of the interconnect, B is the number of fiber delay lines, and k is the number of wavelengths. We also consider efficient switching fabric designs for this type of interconnect. We distinguish between the switching fabric connecting the input fibers to the output fibers and the switching fabric connecting the input fibers to the delay lines and show that by adopting the idea of concentration, the cost of the latter can be reduced significantly in terms of the number of crosspoints.
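
To make the bipartite-matching formulation concrete, here is a minimal sketch of the general-purpose baseline the abstract compares against: a textbook augmenting-path maximum matching, applied to packets on one output fiber. It is not the paper's O(Bk^2) algorithm and it leaves out the delay-line buffers entirely; the helper names, the non-circular conversion range `d`, and the example wavelengths are assumptions made for illustration.

```python
def conversion_graph(packet_wavelengths, k, d):
    """Bipartite graph: packet i -> output channels it may be converted to.

    Assumption for this sketch: a packet on wavelength w can leave on any
    channel in [w - d, w + d], clipped to 0..k-1 (non-circular conversion).
    """
    return [list(range(max(0, w - d), min(k - 1, w + d) + 1))
            for w in packet_wavelengths]


def max_matching(adj, num_right):
    """Maximum bipartite matching via repeated augmenting-path search.

    adj[u] lists the right-side vertices reachable from left vertex u.
    Returns (matching_size, match_right) where match_right[v] is the left
    vertex assigned to right vertex v, or -1 if v is unused.
    """
    match_right = [-1] * num_right

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current owner can move elsewhere.
            if match_right[v] == -1 or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    size = sum(1 for u in range(len(adj)) if try_augment(u, set()))
    return size, match_right


# Four packets arrive on wavelengths 0, 0, 1, 3; k = 4 channels, range d = 1.
adj = conversion_graph([0, 0, 1, 3], k=4, d=1)
granted, assignment = max_matching(adj, num_right=4)

# Three packets all on wavelength 0 can reach only channels 0 and 1.
blocked_adj = conversion_graph([0, 0, 0], k=4, d=1)
blocked_granted, _ = max_matching(blocked_adj, num_right=4)
```

The size of the matching is the number of packets that can be forwarded in the current time slot; packets left unmatched are the ones a recirculating buffer would have to hold.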

15.

Scheduling algorithms for parallel Gaussian elimination with communication costs

Amoura A.K. Bampis E. Konig J.-C. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(7):679-686

We consider a graph theoretical model and study a parallel implementation of the well-known Gaussian elimination method on parallel distributed memory architectures, where the communication delay for the transmission of an elementary data item is higher than the computation time of an elementary instruction. We propose and analyze two low-complexity algorithms for scheduling the tasks of the parallel Gaussian elimination on an unbounded number of completely connected processors. We compare these two algorithms with a higher-complexity general-purpose scheduling algorithm, the DSC heuristic, proposed by A. Gerasoulis and T. Yang (1993).

16.

Scheduling in and out forests in the presence of communication delays

Varvarigou T.A. Roychowdhury V.P. Kallath T. Lawler E. 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(10):1065-1074

We consider the problem of scheduling tasks on multiprocessor architectures in the presence of communication delays. Given a set of dependent tasks, the scheduling problem is to allocate the tasks to processors such that the pre-specified precedence constraints among the tasks are obeyed and certain cost-measures (such as the computation time) are minimized. Several cases of the scheduling problem have been proven to be NP-complete. Nevertheless, there are polynomial time algorithms for interesting special cases of the general scheduling problem. Most of these results, however, do not take into consideration the delays due to message passing among processors. In this paper we study the increase in time complexity of scheduling problems due to the introduction of communication delays. In particular, we address the open problem of scheduling Out-forests (In-forests) in a multiprocessor system of m identical processors when communication delays are considered. The corresponding problem of scheduling Out-forests (In-forests) without communication delays admits an elegant polynomial time solution as presented first by Hu in 1961; however, the problem in the presence of communication delays has remained unsolved. We present here the first known polynomial time algorithms for the computation of the optimal schedule when the number of available processors is given and bounded and both computation and communication delays are assumed to take one unit of time. Furthermore, we present a linear-time algorithm for computing a near-optimal schedule for unit-delay out-forests. The schedule's length exceeds the optimum by no more than (m-2) time units, where m is the number of processors. Hence for two processors the computed schedule is strictly optimum.
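
For orientation, the delay-free baseline attributed to Hu above can be sketched as highest-level-first scheduling of unit-time tasks on an in-forest. This is a minimal sketch of that classical rule only, without communication delays, and it is not the algorithm contributed by the paper; the function name `hu_schedule`, the `parent` dictionary encoding, and the example forest are assumptions.

```python
def hu_schedule(parent, m):
    """Highest-level-first scheduling of unit tasks in an in-forest.

    parent: {task: its single successor, or None for a root}
    m: number of identical processors
    Returns a list of time slots, each a list of at most m tasks.
    Communication delays are NOT modeled in this sketch.
    """
    # Level of a task = number of tasks on the path from it to its root.
    level = {}

    def compute_level(t):
        if t not in level:
            level[t] = 1 if parent[t] is None else 1 + compute_level(parent[t])
        return level[t]

    for t in parent:
        compute_level(t)

    # pending[t] = number of predecessors of t that have not finished yet.
    pending = {t: 0 for t in parent}
    for t, p in parent.items():
        if p is not None:
            pending[p] += 1

    ready = [t for t in parent if pending[t] == 0]
    slots = []
    while ready:
        # Run the m ready tasks that are farthest from their roots.
        ready.sort(key=lambda t: (-level[t], t))
        batch, ready = ready[:m], ready[m:]
        slots.append(batch)
        for t in batch:
            p = parent[t]
            if p is not None:
                pending[p] -= 1
                if pending[p] == 0:
                    ready.append(p)
    return slots


# Example in-forest: a, b, c feed d; d and e feed the root f.
parent = {"a": "d", "b": "d", "c": "d", "d": "f", "e": "f", "f": None}
slots = hu_schedule(parent, m=2)
```

Once a unit communication delay is charged whenever a task and its predecessor run on different processors, this greedy rule is no longer sufficient on its own, which is the gap the abstract describes.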

17.

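MEWFM: a task-graph-based parallel scheduling algorithm

刘敏娜 解争龙 黄素萍 《计算机工程与应用》2015,51(10):67-71

With the wide adoption of multicore processor architectures in computing, how to schedule computational tasks sensibly has become a widely discussed problem. Task scheduling algorithms for multiprocessors already exist, but they go through many iterations during execution and are relatively inefficient. This paper proposes MEWFM, an improved wavefront scheduling algorithm with a short execution time and a speedup close to the number of processor cores. It consists of three sub-algorithms: task graph layering, intra-layer scheduling, and error-descent scheduling. The characteristics and execution flow of these algorithms are analyzed in detail. Experimental evaluation shows that the algorithm offers fast execution and high performance for task scheduling in multiprocessor environments.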

18.

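A fast and effective static task scheduling algorithm in homogeneous computing environments (Total citations: 10; self-citations: 1; citations by others: 9)

李庆华 韩建军 Abbas A. Essa 《计算机研究与发展》2005,42(1):118-125

Scheduling tasks quickly and effectively is a key problem in multiprocessor computing environments. The DAG is currently the most popular model for describing task dependencies in task scheduling algorithms. Earlier literature proposed a new, more practical and more general model, TTIG, and its corresponding MATE algorithm (for homogeneous computing environments). This paper extends the TTIG model and proposes a new algorithm for homogeneous systems with two heuristics (GBHA1 and GBHA2). GBHA eliminates cycles in the graph as far as possible by grouping, so it can obtain global information about the task graph and achieves better scheduling performance. In simulation experiments, the algorithm was compared under different test conditions with MATE and with other effective DAG-based scheduling algorithms for homogeneous environments. The results show that GBHA clearly outperforms MATE; compared with the DAG-based scheduling algorithms, each has its own strengths in performance, but GBHA has a significant advantage in time complexity.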

19.

A parallel computing engine for a class of time critical processes

Nabhan T.M. Zomaya A.Y. 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》1997,27(5):774-786

This paper focuses on the efficient parallel implementation of systems of numerically intensive nature over loosely coupled multiprocessor architectures. These analytical models are of significant importance to many real-time systems that have to meet severe time constraints. A parallel computing engine (PCE) has been developed in this work for the efficient simplification and the near optimal scheduling of numerical models over the different cooperating processors of the parallel computer. First, the analytical system is efficiently coded in its general form. The model is then simplified by using any available information (e.g., constant parameters). A task graph representing the interconnections among the different components (or equations) is generated. The graph can then be compressed to control the computation/communication requirements. The task scheduler employs a graph-based iterative scheme, based on the simulated annealing algorithm, to map the vertices of the task graph onto a Multiple-Instruction-stream Multiple-Data-stream (MIMD) type of architecture. The algorithm uses a nonanalytical cost function that properly considers the computation capability of the processors, the network topology, the communication time, and congestion possibilities. Moreover, the proposed technique is simple, flexible, and computationally viable. The efficiency of the algorithm is demonstrated by two case studies with good results.

20.

Communication-aware task scheduling and voltage selection for total energy minimization in a multiprocessor system using Ant Colony Optimization

HyunJin Kim Sungho Kang 《Information Sciences》2011,181(18):3995-4008

Energy consumption is a key parameter when highly computational tasks are performed in a multiprocessor system. In this case, in order to reduce total energy consumption, task scheduling and low-power methodology should be combined in an efficient way. This paper proposes an algorithm for off-line communication-aware task scheduling and voltage selection using Ant Colony Optimization. The proposed algorithm minimizes total energy consumption of an application executing on a homogeneous multiprocessor system. The artificial agents explore the search space based on stochastic decision-making using global heuristic information with total energy consumption and local heuristic information with interprocessor communication volume. In search space exploration, both voltage selection and the dependencies between tasks are considered. The pheromone trails are updated by normalizing the total energy consumption. The pheromone trails represent the global heuristic information in order to utilize all energy consumption information from previously evaluated solutions. Experimental results show that the proposed algorithm outperforms traditional communication-aware task scheduling and task scheduling using genetic algorithms in terms of total energy consumption.