Similar Documents
20 similar documents found.
1.
On-line scheduling of scalable real-time tasks on multiprocessor systems
The computation time of a scalable task depends on the number of processors allocated to it in a multiprocessor system. As more processors are allocated to a scalable task, its overall computation time decreases, but the total amount of processor time devoted to its execution, called the workload, increases due to parallel execution overhead. In this paper, we propose a task scheduling algorithm that exploits this property of scalable tasks for on-line, real-time scheduling. In the proposed algorithm, the total workload of the scheduled tasks is reduced by allocating as few processors as possible to each task without missing its deadline. As a result, the processors in the system carry less load for the scheduled tasks and can execute more newly arriving tasks before their deadlines. Simulation results show that the proposed algorithm performs significantly better than the conventional algorithm, which executes each task on a fixed number of processors.
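The central rule of the abstract above, giving each scalable task the smallest allotment that still meets its deadline, can be illustrated with a minimal Python sketch. The linear-overhead execution-time model and the function name are assumptions made for illustration, not the model used in the paper.

```python
def min_processors_meeting_deadline(seq_time, overhead_per_proc, deadline, max_procs):
    """Return the smallest allotment p (and its execution time) that meets the deadline.

    Assumed, illustrative model: with p processors a scalable task runs in
    seq_time / p + overhead_per_proc * (p - 1), so its workload
    (p times the execution time) grows as more processors are added.
    """
    for p in range(1, max_procs + 1):
        exec_time = seq_time / p + overhead_per_proc * (p - 1)
        if exec_time <= deadline:
            return p, exec_time   # fewest processors -> least total workload
    return None                   # the task cannot meet its deadline at all


# Example: 100 time units of sequential work, 2 units of per-processor
# overhead, a deadline of 40 time units, and at most 16 processors.
print(min_processors_meeting_deadline(100.0, 2.0, 40.0, 16))   # -> (3, 37.33...)
```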

2.
A clustering scheduling algorithm based on task duplication for multiprocessors
The quality of task scheduling is one of the key factors determining the performance of parallel and distributed computer systems. To optimize task scheduling, a new scheduling algorithm is proposed that builds on several classical algorithms (such as LG and PPA). On one hand, the algorithm duplicates qualifying predecessor tasks to shorten the schedule length; on the other hand, it judiciously duplicates other predecessor tasks and merges redundant clusters to reduce the number of processors required. Experiments show that the algorithm outperforms the classical algorithms above in both schedule length and the number of processors required, and has lower time complexity, which is of practical value for improving the performance of parallel computer systems.

3.
In this paper, we study an online scheduling problem with moldable parallel tasks on m processors. Each moldable task can be processed simultaneously on any number of processors of a parallel computer, and the processing time of a moldable task depends on the number of processors allotted to it. Tasks arrive one by one. Upon arrival of each task, the scheduler has to determine both the number of processors and the starting time for the task. Moreover, these decisions cannot be changed in the future. The objective is to attain a schedule such that the longest completion time over all tasks, i.e., the makespan, is minimized. First, we provide a general framework to show that any ρ-bounded algorithm for scheduling rigid parallel tasks (where the number of processors for a task is fixed a priori) can be extended to yield an algorithm for scheduling moldable tasks with a competitive ratio of 4ρ if the ratio ρ is known beforehand. As a consequence, we achieve the first constant competitive ratio, 26.65, for the moldable parallel task scheduling problem. Next, we provide an improved algorithm with a competitive ratio of at most 16.74.

4.
In parallel and distributed applications, it is very likely that object-oriented languages, such as Java and Ruby, and large-scale semistructured data written in XML will be employed. However, because of their inherent dynamic memory management, parallel and distributed applications must sometimes suspend the execution of all tasks running on the processors. This adversely affects their execution on the parallel and distributed platform. In this paper, we propose a new task scheduling method called CP/MM (Critical Path/Memory Management) which can efficiently schedule tasks for applications requiring memory management. The underlying concept is to consider the cost due to memory management when the task scheduling system allocates ready (executable) coarse-grain tasks, or macro-tasks, to processors. We have developed three task scheduling modules, including CP/MM, for a task scheduling system which is implemented on a Java RMI (Remote Method Invocation) communication infrastructure. Our experimental results show that CP/MM can successfully prevent high-priority macro-tasks from being affected by the garbage collection arising from memory management, so that CP/MM can efficiently schedule distributed programs whose critical paths are relatively long.

5.
The running time and memory requirement of a parallel program with dynamic, lightweight threads depend heavily on the underlying thread scheduler. In this paper we present a new asynchronous, space-efficient scheduling algorithm for shared memory machines that combines the low scheduling overheads and good locality of work stealing with the low space requirements of depth-first schedulers. For a nested-parallel program with depth D and serial space requirement S₁, we show that the expected space requirement is S₁ + O(K⋅p⋅D) on p processors. Here, K is a user-adjustable runtime parameter, which provides a tradeoff between running time and space requirement. Our algorithm achieves good locality and low scheduling overheads by automatically increasing the granularity of the work scheduled on each processor. We have implemented the new scheduling algorithm in the context of a native, user-level implementation of POSIX standard threads (Pthreads), and evaluated its performance using a set of C-based benchmarks that have dynamic or irregular parallelism. We compare the performance of our scheduler with that of two previous schedulers: the thread library's original scheduler (which uses a FIFO queue), and a provably space-efficient depth-first scheduler. At a fine thread granularity, our scheduler outperforms both of these previous schedulers, but requires marginally more memory than the depth-first scheduler. We also present simulation results on synthetic benchmarks to compare our scheduler with space-efficient versions of both a work-stealing scheduler and a depth-first scheduler. The results indicate that, unlike these previous approaches, the new algorithm covers a range of scheduling granularities and space requirements, and allows the user to trade the space requirement of a program against the scheduling granularity.

6.
Scheduling tasks onto the processors of a parallel system is a crucial part of program parallelisation. Due to the NP-hard nature of the task scheduling problem, scheduling algorithms are based on heuristics that try to produce good rather than optimal schedules. Nevertheless, in certain situations it is desirable to have optimal schedules, for example for time-critical systems or to evaluate scheduling heuristics. This paper investigates the task scheduling problem using the A* search algorithm, which is a best-first state space search. The adaptation of the A* search algorithm to the task scheduling problem is referred to as the A* scheduling algorithm. The A* scheduling algorithm can produce optimal schedules in reasonable time for small to medium sized task graphs with several tens of nodes. In comparison to a previous approach, the A* scheduling algorithm presented here has a significantly reduced search space due to a much improved consistent and admissible cost function f(s) and additional pruning techniques. Experimental results show that the cost function and the various pruning techniques are very effective on the evaluated workload. Last but not least, the results show that the proposed A* scheduling algorithm significantly outperforms the previous approach.
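To make the state-space formulation concrete, here is a minimal Python sketch of a best-first (A*-style) search over partial schedules, ordered by an admissible estimate f(s). The state encoding, the zero-communication-cost assumption, and the plain bottom-level bound are simplifications chosen for illustration; they are not the cost function f(s) or the pruning techniques of the paper.

```python
import heapq
import itertools

def astar_schedule(tasks, succ, num_procs):
    """Best-first (A*-style) search for a minimum-makespan schedule.

    tasks: {task: execution_time}; succ: {task: [successors]}.
    Communication costs are ignored to keep the sketch short.
    """
    pred = {t: set() for t in tasks}
    for t, ss in succ.items():
        for s in ss:
            pred[s].add(t)

    blevel = {}
    def bl(t):  # longest computation path from t to an exit node
        if t not in blevel:
            blevel[t] = tasks[t] + max((bl(s) for s in succ.get(t, ())), default=0.0)
        return blevel[t]

    def f(makespan, done):  # admissible: never overestimates the final makespan
        return max(makespan, max((bl(t) for t in tasks if t not in done), default=0.0))

    tie = itertools.count()
    # A state is (scheduled set, per-processor ready times, per-task finish times).
    heap = [(f(0.0, frozenset()), next(tie), frozenset(), (0.0,) * num_procs, {})]
    while heap:
        _, _, done, ready, finish = heapq.heappop(heap)
        if len(done) == len(tasks):
            return max(finish.values())           # optimal makespan
        for t in tasks:
            if t in done or not pred[t] <= done:
                continue                          # task not ready yet
            data_ready = max((finish[p] for p in pred[t]), default=0.0)
            for proc in range(num_procs):
                start = max(ready[proc], data_ready)
                end = start + tasks[t]
                new_ready = ready[:proc] + (end,) + ready[proc + 1:]
                new_finish = {**finish, t: end}
                new_done = done | {t}
                heapq.heappush(heap, (f(max(new_finish.values()), new_done),
                                      next(tie), new_done, new_ready, new_finish))


# Example: a small diamond graph scheduled on 2 processors.
tasks = {"a": 2, "b": 3, "c": 3, "d": 2}
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
print(astar_schedule(tasks, succ, 2))   # -> 7.0
```

Because the bound never overestimates and states are expanded in order of f(s), the first complete schedule popped from the queue is optimal; a tighter cost function and pruning, as in the paper, only shrink how much of the state space is visited.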

7.
Multi-core real-time scheduling for generalized parallel task models
Multi-core processors offer a significant performance increase over single-core processors. They have the potential to enable computation-intensive real-time applications with stringent timing constraints that cannot be met on traditional single-core processors. However, most results in traditional multiprocessor real-time scheduling are limited to sequential programming models and ignore intra-task parallelism. In this paper, we address the problem of scheduling periodic parallel tasks with implicit deadlines on multi-core processors. We first consider a synchronous task model where each task consists of segments, each segment having an arbitrary number of parallel threads that synchronize at the end of the segment. We propose a new task decomposition method that decomposes each parallel task into a set of sequential tasks. We prove that our task decomposition achieves a resource augmentation bound of 4 and 5 when the decomposed tasks are scheduled using global EDF and partitioned deadline monotonic scheduling, respectively. Finally, we extend our analysis to a directed acyclic graph (DAG) task model where each node in the DAG has a unit execution requirement. We show how these tasks can be converted into synchronous tasks such that the same decomposition can be applied and the same augmentation bounds hold. Simulations based on synthetic workloads demonstrate that the derived resource augmentation bounds are safe and sufficient.

8.
The amount of memory required by a parallel program may be spectacularly larger than the memory required by an equivalent sequential program, particularly for programs that use recursion extensively. Since most parallel programs are nondeterministic in behavior, even when computing a deterministic result, parallel memory requirements may vary from run to run, even with the same data. Hence, parallel memory requirements may be both large (relative to the memory requirements of an equivalent sequential program) and unpredictable. We assume that each parallel program has an underlying sequential execution order that may be used as a basis for predicting parallel memory requirements. We propose a simple restriction that is sufficient to ensure that any program that will run in n units of memory sequentially can run in m·n units of memory on m processors, using a scheduling algorithm that is always within a factor of two of being optimal with respect to time. Any program can be transformed into one that satisfies the restriction, but some potential parallelism may be lost in the transformation. Alternatively, it is possible to define a parallel programming language in which only programs satisfying the restriction can be written.

9.
The multiprocessor scheduling problem is the problem of scheduling the tasks of a precedence-constrained task graph (representing a parallel program) onto the processors of a multiprocessor in a way that minimizes the completion time. Since this problem is known to be NP-hard in the strong sense in all but a few very restricted cases, heuristic algorithms are being developed which obtain near-optimal schedules in a reasonable amount of computation time. We present an efficient heuristic algorithm for scheduling precedence-constrained task graphs with nonnegligible intertask communication onto multiprocessors, taking contention in the communication channels into consideration. Our algorithm for obtaining satisfactory suboptimal schedules is based on the classical list scheduling strategy. It simultaneously exploits the schedule holes generated in the processors and in the communication channels during the scheduling process in order to produce better schedules. We demonstrate the effectiveness of our algorithm by comparing it with two competing heuristic algorithms available in the literature.
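A minimal Python sketch of insertion-based list scheduling follows: tasks are taken in a given priority order and each is placed into the earliest idle gap ("schedule hole") on the processor that lets it finish first. Channel contention, which the paper models explicitly, is ignored here, and all names are illustrative.

```python
def insertion_list_schedule(tasks, pred, comm, priority, num_procs):
    """Insertion-based list scheduling sketch.

    tasks: {task: exec_time}; pred: {task: [predecessors]};
    comm:  {(u, v): communication cost if u and v run on different processors};
    priority: tasks in scheduling order (must respect precedence).
    """
    slots = {p: [] for p in range(num_procs)}   # sorted (start, end, task) per processor
    placed = {}                                  # task -> (proc, start, end)

    def earliest_fit(proc, ready, length):
        """Earliest start >= ready on proc, reusing idle holes between existing slots."""
        cursor = ready
        for start, end, _ in slots[proc]:
            if cursor + length <= start:         # the task fits in the hole before this slot
                return cursor
            cursor = max(cursor, end)
        return cursor                            # otherwise append after the last slot

    for t in priority:
        best = None
        for proc in range(num_procs):
            ready = 0.0
            for u in pred.get(t, ()):
                u_proc, _, u_end = placed[u]
                delay = 0.0 if u_proc == proc else comm.get((u, t), 0.0)
                ready = max(ready, u_end + delay)
            start = earliest_fit(proc, ready, tasks[t])
            if best is None or start + tasks[t] < best[2]:
                best = (proc, start, start + tasks[t])
        proc, start, end = best
        slots[proc].append((start, end, t))
        slots[proc].sort()
        placed[t] = best
    return placed, max(end for _, _, end in placed.values())


# Example: chain a -> b with a cross-processor communication cost of 4;
# b stays on a's processor, giving a makespan of 4.0.
placed, makespan = insertion_list_schedule(
    {"a": 2, "b": 2}, {"b": ["a"]}, {("a", "b"): 4}, ["a", "b"], 2)
print(placed, makespan)
```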

10.
A relation-matrix scheduling algorithm for dependent tasks on multiprocessors
王凤儒, 张淑丽. 《计算机学报》 (Chinese Journal of Computers), 1998, 21(10): 933-938
This paper studies the scheduling of dependent tasks on multiprocessor systems. Considering both time and space, it proposes a new scheduling algorithm for dependent tasks on multiple machines: the Relation Matrix Scheduling Algorithm (RMSA). A variable relation matrix Mu is used to represent the relationship between the space requirements of tasks and the local memory of the processors, as well as the state of task assignment. Experiments show that the algorithm achieves a short schedule length together with good time balance and space coordination.

11.
On exploiting task duplication in parallel program scheduling
One of the main obstacles to obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead incurred when tasks executing on different processors exchange data. Given a task graph, duplication-based scheduling can mitigate this overhead by allocating some of the tasks redundantly on more than one processor. In this paper, we focus on the problem of using duplication in static scheduling of task graphs on parallel and distributed systems. We discuss five previously proposed algorithms and examine their merits and demerits. We describe some of the essential principles for exploiting duplication in a more useful manner and, based on these principles, propose an algorithm which outperforms the previous algorithms. The proposed algorithm generates optimal solutions for a number of task graphs. The algorithm assumes an unbounded number of processors. For scheduling on a bounded number of processors, we propose a second algorithm which controls the degree of duplication according to the number of available processors. The proposed algorithms are analytically and experimentally evaluated, and are also compared with the previous algorithms.

12.
A greedy algorithm for scheduling Fork-Join task graphs
The goal of a task scheduling algorithm is to assign the set of tasks that make up a parallel program to multiple processors so as to minimize the program's completion time, which is an NP-complete problem. Although many algorithms can produce optimal schedules when the tasks satisfy certain conditions, most of them ignore issues such as saving processors and minimizing the overall completion time of the program. The Fork-Join structure is a fundamental structure in parallel processing. This paper therefore proposes a new greedy scheduling algorithm, specifically for Fork-Join task graphs, that can produce optimal schedules; the algorithm achieves high speedup and overall efficiency with time complexity O(v²), where v is the number of tasks in the task set. Experimental results show that, compared with other algorithms, it yields shorter schedule lengths and completion times while using fewer processors.
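As a rough illustration of the trade-off a greedy Fork-Join scheduler faces, the sketch below keeps a child task on the fork task's processor when running it locally is no slower than paying the round-trip communication cost, and otherwise starts it on a fresh processor. The rule, the cost model, and the names are hypothetical; the paper's actual greedy criterion is not reproduced here.

```python
def greedy_fork_join(entry, children, join, exec_time, comm):
    """Greedy scheduling sketch for a fork-join graph (entry -> children -> join).

    exec_time: {task: computation time}; comm: {child: cost of moving data
    between the entry/join processor and another processor}.
    """
    # Children with expensive communication are the best candidates to stay
    # on the entry's processor, so consider them first.
    order = sorted(children, key=lambda t: comm[t], reverse=True)

    entry_finish = exec_time[entry]
    local_finish = entry_finish      # finish time of the chain kept on the entry's processor
    finish = {}                      # child -> time its result is ready for the join task
    processors = 1
    for t in order:
        stay = local_finish + exec_time[t]                      # run after the local chain
        move = entry_finish + comm[t] + exec_time[t] + comm[t]  # ship data out, run, ship result back
        if stay <= move:
            local_finish = stay
            finish[t] = stay         # same processor as the join: no communication back
        else:
            finish[t] = move
            processors += 1
    makespan = max(finish.values(), default=entry_finish) + exec_time[join]
    return makespan, processors


# Example: one fork task, three children, one join task.
print(greedy_fork_join(
    "f", ["a", "b", "c"], "j",
    {"f": 1, "a": 4, "b": 3, "c": 2, "j": 1},
    {"a": 5, "b": 1, "c": 1}))   # -> (7, 3)
```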

13.
章军, 章立生, 韩承德. 《软件学报》 (Journal of Software), 1999, 10(11): 1156-1162
In distributed memory multiprocessor (DMM) systems, the communication overhead between tasks running on different processors remains substantial, and sometimes even cancels out the benefit of multiprocessor parallelism. For parallel programs to execute efficiently on DMM systems, a sound scheduling technique must be used to assign tasks to processors. The paper first gives the task model, the processor model, and a formal description of the scheduling problem, and then studies, on this basis, the three most important issues in task scheduling, namely: (1) how to order the tasks participating in scheduling, (2) how to select routes, and (3) how to assign tasks to processors. Among these, route selection …

14.
The growth of energy consumption has been explosive in current data centers, supercomputers, and public cloud systems. This explosion has led to greater advocacy of green computing, and many efforts focus on task scheduling in order to reduce energy dissipation. In order to obtain greater energy reduction while maintaining quality of service by meeting deadlines, this paper proposes a DVFS-enabled Energy-efficient Workflow Task Scheduling algorithm: DEWTS. By merging relatively inefficient processors and reclaiming slack time, DEWTS can leverage the useful slack time recurrently after servers are merged. DEWTS first calculates the initial scheduling order of all tasks, and obtains the overall makespan and deadline based on the Heterogeneous Earliest Finish Time (HEFT) algorithm. By re-sorting the processors according to their number of running tasks and their energy utilization, underutilized processors can be merged by closing the last node and redistributing the tasks assigned to it. Finally, in the task-slacking phase, tasks can be placed in idle slots at a lower voltage and frequency using the DVFS technique, without violating dependency constraints or extending the slackened makespan. Experiments on a number of randomly generated DAG workflows show that DEWTS can reduce total power consumption by up to 46.5 % for various parallel applications while balancing scheduling performance.
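The last step described above, slowing a task down just enough to fill the idle slot left after slack reclamation, can be sketched in a few lines of Python. The inverse-frequency execution-time model and the discrete frequency levels are common simplifications assumed here for illustration; they are not necessarily the model used by DEWTS.

```python
def dvfs_scale(exec_time, slot_length, freq_levels):
    """Pick the lowest discrete frequency that still fits a task into its slot.

    exec_time:   task execution time at the highest (normalized) frequency;
    slot_length: length of the slot, i.e. execution time plus reclaimable slack;
    freq_levels: available normalized frequencies, e.g. [1.0, 0.8, 0.6, 0.4].
    Assumes execution time scales inversely with frequency.
    """
    for f in sorted(freq_levels):                 # try the lowest frequency first
        if exec_time / f <= slot_length:
            return f, exec_time / f               # lowest frequency that meets the slot
    return max(freq_levels), exec_time / max(freq_levels)


# Example: a 6-unit task whose slot, after slack reclamation, is 10 units long.
print(dvfs_scale(6.0, 10.0, [1.0, 0.8, 0.6, 0.4]))   # -> (0.6, 10.0)
```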

15.
One of the most important problems arising in multiprocessor systems is the scheduling of tasks on a set of parallel processors. Recently, new models of task processing have been formulated in which certain tasks can require more than one processor at a time. This model is especially justified in some applications of multi-microprocessor systems. In this paper, we extend the above model to cover the case of scheduling in the presence of additional scarce resources. First, a subcase of the problem is considered in which preemptable tasks simultaneously need one or two processors and one unit of an additional resource. For this case a low-order polynomial-time algorithm is presented. Then the general case is solved via a linear programming approach.

16.
The rapid development of artificial intelligence places higher demands on high-performance computing, and task scheduling in heterogeneous computing environments has long been a key problem in this field. This paper proposes a Priority Queue Division Scheduling Algorithm (PQDSA). The algorithm determines the number of priority queues from the number of entry nodes of the DAG (directed acyclic graph) task set, partitions the task queues according to the communication and computation costs of the tasks, and then assigns critical-node tasks to suitable queues so as to produce well-formed scheduling queues, thereby increasing inter-task parallelism and reducing the makespan of the task set. In addition, tasks are scheduled onto processors using an insertion-based strategy so that scheduling is carried out more efficiently. PQDSA reduces the time spent between tasks and improves processor scheduling efficiency. Experimental comparison with two classical algorithms shows that the proposed PQDSA algorithm clearly outperforms them in both makespan and scheduling efficiency.

17.
The communication latency incurred when dependent tasks are scheduled in parallel on multi-core processors adversely affects both schedule length and processor utilization. To improve how multi-core systems handle dependent tasks, a parallelism-aware scheduling algorithm is proposed that targets the characteristics of scheduling dependent tasks on multi-core processors. It computes the longest path value from each task to the exit node and assigns the scheduling order in descending order of this value; when allocating processor cores, it considers both the degree of dependence and the earliest possible start time of a task, using a best-match evaluation function. Experimental results show that, compared with the busHEFT and DTSV algorithms, the proposed algorithm achieves shorter scheduling latency, less communication, and higher processor utilization.
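The priority step described above, ordering tasks by the length of their longest path to the exit node, can be illustrated with a short Python sketch. Charging the worst-case (cross-core) communication delay on every edge is one possible reading of the abstract and is assumed here only for illustration.

```python
def scheduling_order(exec_time, succ, comm):
    """Order tasks by their longest path to the exit node, longest first.

    exec_time: {task: computation time}; succ: {task: [successors]};
    comm: {(u, v): communication delay if u and v end up on different cores}.
    """
    longest = {}

    def lp(t):  # longest path from t to an exit node, including edge delays
        if t not in longest:
            longest[t] = exec_time[t] + max(
                (comm.get((t, s), 0.0) + lp(s) for s in succ.get(t, ())),
                default=0.0)
        return longest[t]

    return sorted(exec_time, key=lp, reverse=True)


# Example: diamond graph a -> {b, c} -> d with one expensive edge.
print(scheduling_order(
    {"a": 1, "b": 2, "c": 2, "d": 1},
    {"a": ["b", "c"], "b": ["d"], "c": ["d"]},
    {("a", "b"): 1, ("a", "c"): 1, ("b", "d"): 3, ("c", "d"): 1}))
# -> ['a', 'b', 'c', 'd']: b outranks c because its outgoing edge is costlier.
```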

18.
This paper presents a hybrid methodology for scheduling task graphs onto multiprocessor embedded systems. The proposed methodology is designed for task graphs that are dynamic in nature due to the presence of conditional tasks as well as tasks whose execution times are unpredictable but bounded. We present the methodology as a three-phase strategy in which task nodes are mapped to the processors in the first (static mapping) phase. In the second (selective duplication) phase, some critical nodes are identified and duplicated for possible rescheduling at run-time, depending on the code memory constraints of the processors. The third (online) phase is a run-time scheduling algorithm that performs list scheduling based on the actual dynamics of the schedule up to the current time. We show that this technique provides better schedule lengths (up to 20% shorter) than previous techniques, which are predominantly static in nature, while keeping overhead low and complexity comparable to existing online techniques. The effects of model parameters such as the number of processors, memory, and various task graph parameters on performance are also investigated.

19.
The problem considered is that of scheduling nonpreemptable tasks in multiprocessor systems when tasks require processors and other limited resources for their processing, and when mean flow time is the system performance measure. For each task, the time required for its processing and the amount of each resource it requires are given. Special attention is paid to the computational complexity of algorithms for determining optimal schedules under different assumptions about the environment. For the case of scheduling independent, arbitrary-length tasks where each task may require a unit of an additional resource of one type, an O(n³) algorithm is given. For more complicated resource requirements, however, it is proved that the problem under consideration is NP-hard in the strong sense, even for the case of two processors.

20.
Research on dynamic scheduling algorithms for XEN virtual machines on multi-core platforms
Existing virtual machine scheduling algorithms do not adequately consider the execution efficiency of parallel tasks. Modern processor platforms provide multiple computing cores, making the concurrent execution of multiple virtual machines a reality. For the problem of optimizing parallel virtual machine scheduling on multi-core platforms, a task-characteristic-based CON-Credit scheduling algorithm for virtual machines is proposed. When scheduling parallel tasks, the algorithm allocates computing cores dynamically: a conventional virtual machine scheduling algorithm is used to allocate cores to virtual machines running ordinary tasks, while a customized synchronization algorithm is used to allocate cores to virtual machines running parallel tasks. Experiments show that the CON-Credit scheduling algorithm significantly improves the execution efficiency of parallel tasks.
