Similar Documents
20 similar documents found (search time: 31 ms)
1.
A compact task graph representation for real-time scheduling
A new task graph representation, namely the compact task graph (CTG), is developed to aid in the scheduling of a set of communicating periodic real-time tasks. This representation explicitly expresses the potential for parallelism across tasks as well as the idle times that may be encountered within application tasks. A CTG-based scheduler can generate schedules that are able to meet deadlines by interleaving the execution of tasks on a single processor and/or overlapping the execution of tasks on multiple processors. The construction of a CTG is based upon the busy-idle execution profiles for the tasks generated by the compiler. The profiles are computed assuming that sufficient resources are available for parallel execution of all tasks. Thus, they expose all opportunities for overlapped and interleaved execution. The compiler analyzes the profiles to identify useful opportunities for interleaving and expresses them in the CTG without unnecessary partitioning of the tasks. The CTG is powerful because it expresses schedules that are not expressed by existing approaches for constructing task graphs. Schedules can be generated efficiently since a CTG's construction minimizes the splitting of tasks. We briefly demonstrate the usefulness of CTGs in scheduling real-time tasks and scheduling monitoring tasks without affecting the timing of application tasks. Supported in part by the National Science Foundation through a Presidential Young Investigator Award CCR-9157371 and Grant CCR-9212020.
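A toy illustration of the busy-idle profile idea: represent each task as a list of (busy, idle) segment durations, and allow a second task to be interleaved on the same processor only if its busy segments can be packed, in order, into the first task's idle windows. The representation and the greedy check are assumptions made for this sketch; the compiler-driven CTG construction in the paper is considerably more general.

```python
def idle_windows(profile, start=0):
    """profile: [(busy, idle), ...]; yields (window_start, window_length)."""
    t = start
    for busy, idle in profile:
        t += busy
        if idle > 0:
            yield t, idle
        t += idle

def can_interleave(host, guest):
    """Greedy check: can every busy segment of `guest` be packed, in order,
    into the idle windows of `host` on the same processor?"""
    windows = list(idle_windows(host))
    w = 0
    for busy, _ in guest:
        while w < len(windows) and windows[w][1] < busy:
            w += 1                                   # window too small, try the next one
        if w == len(windows):
            return False
        start, length = windows[w]
        windows[w] = (start + busy, length - busy)   # consume part of the window
    return True

# Task A is busy 2, idle 3, busy 2; task B has two busy segments of 1 and 2.
print(can_interleave([(2, 3), (2, 0)], [(1, 0), (2, 0)]))   # True
```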

2.
Previous standby-sparing techniques assume that no tasks access shared resources. In addition, primary tasks and backup tasks are allocated to the primary processor and the spare processor, respectively, and the spare processor schedules tasks at the maximum processor speed. Unlike previous techniques, we study the problem of minimizing energy consumption while preserving the original reliability for a dynamic-priority real-time task set with shared resources in a standby-sparing system. We propose a novel energy-aware mixed partitioning scheduling algorithm (EAMPSA). The earliest deadline first/dynamic deadline modification (EDF/DDM) scheduling scheme is used to ensure that shared resources are accessed in a mutually exclusive manner. A uniform speed is used for both the primary processor and the spare processor. In addition, we map tasks with a mixed partitioning of primary and backup tasks: tasks that need to access shared resources are mapped onto the primary processor, and tasks with no resource requirements are mapped onto the spare processor. Furthermore, DVS and DPM techniques are used for both primary and backup tasks to save energy. The experimental results show that the EAMPSA algorithm consumes on average 55.43% less energy than the SSPT algorithm.
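A minimal sketch of one ingredient mentioned above: running an EDF-scheduled periodic task set at a uniform, scaled DVS speed chosen from its utilization. The function name, the speed levels, and the single-processor EDF bound are illustrative assumptions; EAMPSA itself additionally handles shared resources, standby sparing, and DPM, none of which are modeled here.

```python
def uniform_speed(tasks, levels):
    """Pick the lowest normalized DVS speed that keeps an EDF-scheduled
    periodic task set feasible: utilization scaled by 1/speed must stay <= 1.

    tasks  -- list of (wcet_at_full_speed, period) pairs
    levels -- available normalized speeds, e.g. [0.4, 0.6, 0.8, 1.0]
    """
    u = sum(c / t for c, t in tasks)      # utilization at full speed
    for s in sorted(levels):
        if u / s <= 1.0:                  # EDF bound on a single processor
            return s
    return None                           # infeasible even at full speed

# Example: utilization 0.5, so the 0.6 speed level already suffices.
print(uniform_speed([(1, 4), (1, 4)], [0.4, 0.6, 0.8, 1.0]))
```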

3.
Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.
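The abstract does not give the allocation rule, so the following is only a hedged sketch of one common way to trade task against data parallelism: give every concurrently runnable (moldable) task one processor, then repeatedly hand a spare processor to the task that currently bounds the finish time. The Amdahl-style runtime model and the greedy rule are assumptions, not the paper's algorithm.

```python
def runtime(seq_time, alpha, p):
    """Amdahl-style estimate: alpha is the parallelizable fraction."""
    return seq_time * ((1 - alpha) + alpha / p)

def allocate(tasks, total_procs):
    """tasks: dict name -> (seq_time, alpha). Returns processors per task."""
    alloc = {name: 1 for name in tasks}
    spare = total_procs - len(tasks)
    while spare > 0:
        # Task that currently determines the completion time of this step.
        bottleneck = max(tasks, key=lambda n: runtime(*tasks[n], alloc[n]))
        gain = (runtime(*tasks[bottleneck], alloc[bottleneck])
                - runtime(*tasks[bottleneck], alloc[bottleneck] + 1))
        if gain <= 0:
            break                         # more processors no longer help
        alloc[bottleneck] += 1
        spare -= 1
    return alloc

# Two concurrent tasks sharing 6 processors; names and numbers are made up.
print(allocate({"fft": (10.0, 0.9), "solve": (4.0, 0.5)}, 6))
```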

4.
On parallelizing the multiprocessor scheduling problem
Existing heuristics for scheduling a node- and edge-weighted directed task graph onto multiple processors can produce satisfactory solutions but incur high time complexities, which are exacerbated in more realistic environments with relaxed assumptions. Consequently, these heuristics do not scale well and cannot handle problems of moderate sizes. A natural approach to reducing complexity, while aiming for a similar or potentially better solution, is to parallelize the scheduling algorithm. This can be done by partitioning the task graph and concurrently generating partial schedules for the partitioned parts, which are then concatenated to obtain the final schedule. The problem, however, is nontrivial, as there exist dependencies among the nodes of a task graph which must be preserved for generating a valid schedule. Moreover, the time clock for scheduling is global for all the processors (that are executing the parallel scheduling algorithm), making the inherent parallelism invisible. In this paper, we introduce a parallel algorithm that is guided by a systematic partitioning of the task graph to perform scheduling using multiple processors. The algorithm schedules both the tasks and messages, is suitable for graphs with arbitrary computation and communication costs, and is applicable to systems with arbitrary network topologies using homogeneous or heterogeneous processors. We have implemented the algorithm on the Intel Paragon and compared it with three closely related algorithms. The experimental results indicate that our algorithm yields higher-quality solutions while using an order of magnitude smaller scheduling times. The algorithm also exhibits an interesting trade-off between the solution quality and speedup while scaling well with the problem size.

5.
In this paper, a processor allocation mechanism for NoC-based chip multiprocessors is presented. Processor allocation is a well-known problem in parallel computer systems and aims to allocate the processing nodes of a multiprocessor to different tasks of an input application at run time. The proposed mechanism targets optimizing the on-chip communication power/latency and relies on two procedures: processor allocation and task migration. Allocation is done by a fast heuristic algorithm to allocate the free processors to the tasks of an incoming application when a new application begins execution. The task-migration algorithm is activated when some application completes execution and frees up the allocated resources. Task migration uses the recently deallocated processors and tries to rearrange the current tasks in order to find a better mapping for them. The proposed method can also capture the dynamic traffic pattern of the network and perform task migration based on the current communication demands of the tasks. Consequently, task migration adapts the task mapping to the current network status. We adopt a non-contiguous processor allocation strategy in which the tasks of the input application are allowed to be mapped onto disjoint regions (groups of processors) of the network. We then use virtual point-to-point circuits, a state-of-the-art fast on-chip connection designed for network-on-chips, to virtually connect the disjoint regions and make the communication latency/power closer to the values offered by contiguous allocation schemes. The experimental results show considerable improvement over existing allocation mechanisms.

6.
In this paper, we present two heuristic energy-aware scheduling algorithms (EGMS and EGMSIV) for scheduling task precedence graphs in an embedded multiprocessor system having processing elements with dynamic voltage scaling capabilities. Unlike most energy-aware scheduling algorithms that consider task ordering and voltage scaling separately from task mapping, our algorithms consider them in an integrated way. EGMS uses the concept of energy gradient to select tasks to be mapped onto new processors and voltage levels. EGMSIV extends EGMS by introducing intra-task voltage scaling using a Linear Programming (LP) formulation to further reduce the energy consumption. Through rigorous simulations, we compare the performance of our proposed algorithms with a few approaches presented in the literature. The results demonstrate that our algorithms are capable of obtaining energy-efficient schedules using less optimization time. On average, our algorithms produce schedules which consume 10% less energy with more than 47% reduction in optimization time when compared to a few approaches presented in the literature. In particular, our algorithms perform better in generating energy-efficient schedules for larger task graphs. Our results show a reduction of up to 57% in energy consumption for larger task graphs compared to other approaches.
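A toy view of an "energy gradient" style greedy step, in the spirit of the description above: among candidate voltage reductions, pick the one that saves the most energy per unit of added execution time while the deadline still holds. The energy/time model (E proportional to V^2 * cycles, t proportional to cycles / V), the single-processor setting, and the absence of remapping are simplifying assumptions for illustration, not the EGMS/EGMSIV formulation.

```python
def energy(cycles, v):
    return cycles * v ** 2

def exec_time(cycles, v):
    return cycles / v

def greedy_voltage_scaling(tasks, levels, deadline):
    """tasks: dict name -> cycles. Returns the chosen voltage per task."""
    volt = {n: max(levels) for n in tasks}            # start at full speed
    while True:
        total_t = sum(exec_time(c, volt[n]) for n, c in tasks.items())
        best, best_grad = None, 0.0
        for n, c in tasks.items():
            lower = [v for v in levels if v < volt[n]]
            if not lower:
                continue
            v_new = max(lower)                        # next lower voltage level
            dt = exec_time(c, v_new) - exec_time(c, volt[n])
            de = energy(c, volt[n]) - energy(c, v_new)
            # energy gradient: energy saved per unit of added execution time
            if total_t + dt <= deadline and de / dt > best_grad:
                best, best_grad = (n, v_new), de / dt
        if best is None:
            return volt
        volt[best[0]] = best[1]

print(greedy_voltage_scaling({"t1": 10, "t2": 6}, [0.6, 0.8, 1.0], 22.0))
```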

7.
The authors present a compile-time scheduling heuristic called dynamic level scheduling, which accounts for interprocessor communication overhead when mapping precedence-constrained, communicating tasks onto heterogeneous processor architectures with limited or possibly irregular interconnection structures. This technique uses dynamically changing priorities to match tasks with processors at each step, and schedules over both spatial and temporal dimensions to eliminate shared resource contention. This method is fast, flexible, widely targetable, and displays promising performance.
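A compact sketch of the dynamic-level idea (static level minus the earliest possible start time on a candidate processor), scheduling at each step the (ready task, processor) pair with the highest dynamic level. The data structures and the homogeneous-processor simplification are assumptions for this sketch; the published algorithm also models processor heterogeneity and link contention, which this toy version omits.

```python
def static_levels(succ, w):
    """Longest computation-cost path from each task to an exit task."""
    memo = {}
    def sl(t):
        if t not in memo:
            memo[t] = w[t] + max((sl(s) for s, _ in succ.get(t, [])), default=0)
        return memo[t]
    return {t: sl(t) for t in w}

def dls_schedule(w, succ, pred, num_procs):
    """w: task -> cost; succ/pred: task -> [(neighbor, comm_cost), ...]."""
    sl = static_levels(succ, w)
    finish, place = {}, {}
    proc_free = [0.0] * num_procs
    ready = {t for t in w if not pred.get(t)}
    while ready:
        best = None
        for t in ready:
            for p in range(num_procs):
                # data-available time: predecessors' finish plus communication
                da = max((finish[u] + (0 if place[u] == p else c)
                          for u, c in pred.get(t, [])), default=0.0)
                start = max(da, proc_free[p])
                level = sl[t] - start          # the dynamic level
                if best is None or level > best[0]:
                    best = (level, t, p, start)
        _, t, p, start = best
        finish[t], place[t] = start + w[t], p
        proc_free[p] = start + w[t]
        ready.remove(t)
        for s, _ in succ.get(t, []):
            if all(u in finish for u, _ in pred.get(s, [])):
                ready.add(s)
    return finish, place

# Small diamond-shaped graph on 2 processors (all numbers are made up).
w = {"a": 2, "b": 3, "c": 2, "d": 1}
succ = {"a": [("b", 4), ("c", 1)], "b": [("d", 2)], "c": [("d", 2)]}
pred = {"b": [("a", 4)], "c": [("a", 1)], "d": [("b", 2), ("c", 2)]}
print(dls_schedule(w, succ, pred, num_procs=2))
```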

8.
Effective load distribution and resource management are of great importance in designing complex distributed systems such as grids. This presupposes the capability of partitioning the arriving jobs into independent tasks that can be executed simultaneously, assigning the tasks to processors and scheduling the task execution on each processor. A simulation model, consisting of two homogeneous clusters, is considered to evaluate the performance for various workloads. The Deferred policy is applied to collect global system information about processor queues. This paper proposes a special scheduling method referred to as the task clustering method. We examine the efficiency of two task routing policies – one static and one adaptive – and six task scheduling policies, which rearrange processor queues according to a criterion. Our simulation results indicate that the adaptive task routing policy in conjunction with the SGFS-ST scheduling algorithm, which makes the most efficient use of the task clustering method, leads to a significant performance improvement.

9.
Scheduling a task graph with communication delays on a set of identical processors so as to achieve the shortest execution time is of great significance in the scheduling theory and practice of parallel computing. For the problem of scheduling task graphs with communication delays, an improved method based on Satisfiability Modulo Theories (SMT) is proposed. First, constraints such as processor mapping and task execution order are encoded, transforming the task-graph scheduling problem into an SMT problem; then, an SMT solver is invoked to search the feasible solution space and determine the optimal solution. In the constraint-encoding stage, integer variables are used to represent the mapping between tasks and processors, which reduces the complexity of the processor-constraint encoding; in the solver-invocation stage, constraints on independent tasks are added to shrink the solver's search space and further improve the efficiency of finding the optimal solution. Experimental results show that, compared with the original SMT method, the improved SMT method reduces the average solving time by 65.9% and 53.8% in experiments with 20 s and 1 min timeouts, respectively, and that its efficiency advantage grows as the number of processors increases. The improved SMT method can effectively solve the task-graph scheduling problem with communication delays and is especially suited to scheduling scenarios with a large number of processors.
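A toy SMT encoding in the spirit of the improved method described above, written with the Z3 Python API: integer variables for the task-to-processor mapping, start-time variables, communication delays applied only across processors, and a minimized makespan. The task data and the exact constraint set are illustrative assumptions, not the paper's encoding (in particular, the extra constraints on independent tasks are not reproduced here).

```python
from z3 import Int, Optimize, If, Or, sat

tasks = {"a": 2, "b": 3, "c": 2}          # execution times (assumed)
edges = {("a", "b"): 1, ("a", "c"): 2}    # communication delays (assumed)
num_procs = 2

start = {t: Int(f"s_{t}") for t in tasks}
proc = {t: Int(f"p_{t}") for t in tasks}  # integer processor index per task
makespan = Int("makespan")

opt = Optimize()
for t, c in tasks.items():
    opt.add(start[t] >= 0, proc[t] >= 0, proc[t] < num_procs)
    opt.add(makespan >= start[t] + c)

# Precedence: a successor starts after its predecessor finishes,
# plus the communication delay if they run on different processors.
for (u, v), d in edges.items():
    opt.add(start[v] >= start[u] + tasks[u] + If(proc[u] == proc[v], 0, d))

# Tasks mapped to the same processor must not overlap in time.
names = list(tasks)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        u, v = names[i], names[j]
        opt.add(Or(proc[u] != proc[v],
                   start[u] + tasks[u] <= start[v],
                   start[v] + tasks[v] <= start[u]))

opt.minimize(makespan)
if opt.check() == sat:
    print(opt.model())
```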

10.
We study the problem of scheduling independent multiprocessor tasks, where for each task, in addition to the processing time(s), there is a prespecified dedicated subset (or a family of alternative subsets) of processors which are required to process the task simultaneously. Focusing on problems where all required (alternative) subsets of processors have the same fixed cardinality, we present complexity results for computing preemptive schedules with minimum makespan, closing the gap between computationally tractable and intractable instances. In particular, we show that for the dedicated version of the problem, optimal preemptive schedules of bi-processor tasks (i.e., tasks whose dedicated processor sets are all of cardinality two) can be computed in polynomial time. We give various extensions of this result, including one to maximum lateness minimization with release times and due dates. All these results are based on a nice relation between preemptive scheduling and fractional coloring of graphs. In contrast to the positive results, we also prove that the problems of computing optimal preemptive schedules for three-processor tasks or for bi-processor tasks with (possibly several) alternative modes are strongly NP-hard.

11.
Computer Networks, 1999, 31(1-2): 141-150
Multimedia applications require support from the underlying broadband network at the end-to-end communication level. Multicasting is an important paradigm of end-to-end communication. The root node of a multicasting session is responsible for controlling the session, including monitoring, maintenance, and the implementation of the multicasting protocol. The job that controls the multicasting session executes as a group of tasks at the root node of a multicasting tree. The scheduling scheme at the root node should give support to a multicasting session by improving the completion time of the jobs controlling the multicasting session, hence increasing throughput and the probability of admitting new multicast sessions into the system. In this paper, we model the tasks that carry out the multicasting session monitoring and maintenance as a fork-join job executing on a multiprocessor system. We assume that an executing task blocks for device I/O as a part of the activities associated with sending and receiving data packets. We develop two analytic models for scheduling a session control job on a multiprocessor system. The first model allows incoming job tasks to multiplex processors with existing tasks of another multicasting session, while the other model schedules a task of the incoming job to an idle processor. We assume that the overhead of rescheduling a task to another processor is large. We compare the performance of both models and show the range of conditions under which one model outperforms the other. We point out how the results can be used in the design of an adaptive scheduler that uses both models to improve throughput and consequently the probability of admitting new multicast sessions.

12.
For scheduling packet-processing task graphs on pipelined network processors, an earliest-finish-time scheduling algorithm with load balancing, EFT-LB, is proposed. It effectively balances the load of every processor in the pipeline and increases the throughput of the pipeline system. Comparative experiments show a clear improvement in system performance.

13.
A New Algorithm for Scheduling Fork-Join Task Graphs
刘振英, 方滨兴, 姜誉, 张毅, 赵宏. 《软件学报》 (Journal of Software), 2002, 13(4): 693-697
Task scheduling is one of the key factors affecting the efficiency of a network of workstations. Fork-Join task graphs can represent many parallel structures, but existing algorithms for scheduling Fork-Join task graphs ignore the fact that communications cannot proceed in parallel in a non-fully-connected workstation network, and some efficient algorithms do not consider saving the number of processors. Therefore, specifically for this kind of task graph, a task-duplication-based static scheduling algorithm, TSA_FJ, is proposed that jointly considers schedule length, non-parallel communication, and saving processors. Multiple Fork-Join task graphs were generated with randomly produced task execution times and communication times, and the generated task graphs were scheduled with TSA_FJ and other scheduling algorithms. The results show that ...

14.
For bus-based cluster systems, this paper proposes a new task-duplication-based algorithm for scheduling Fork-Join task graphs. The algorithm computes the schedule length through a partitioning of the task set and, without increasing the schedule length, schedules tasks onto already-used processors as far as possible, thereby saving processors. The time complexity of the new algorithm is higher than that of existing algorithms, but its scheduling performance is the best.

15.
Algorithms for scheduling independent tasks on to the processors of a multiprocessor system must trade-off processor load balance, memory locality, and scheduling overhead. Most existing algorithms, however, do not adequately balance these conflicting factors. This paper introduces the self-adjusting dynamic scheduling (SADS) class of algorithms that use a unified cost model to explicitly account for these factors at runtime. A dedicated processor performs scheduling in phases by maintaining a tree of partial schedules and incrementally assigning tasks to the least-cost schedule. A scheduling phase terminates whenever any processor becomes idle, at which time partial schedules are distributed to the processors. An extension of the basic SADS algorithm, called DBSADS, controls the scheduling overhead by giving higher priority to partial schedules with more task-to-processor assignments. These algorithms are compared to two distributed scheduling algorithms within a database application on an Intel Paragon distributed memory multiprocessor system.

16.
A New Fault-Tolerant Scheduling Algorithm for Distributed Control Systems
Most current fault-tolerant scheduling algorithms handle aperiodic tasks by reserving time, so aperiodic tasks cannot be responded to adequately. To address this problem, a new fault-tolerant scheduling algorithm for distributed control systems is proposed. It partitions the task set so that different subsets of periodic tasks run on different processors, giving each processor a different amount of time reserved for aperiodic tasks; when an aperiodic task arrives, it can thus be responded to effectively. The results show that the method improves the efficiency of fault-tolerant scheduling.

17.
Programming models for distributed systems often construct a task graph for the program to be executed on a distributed system of processors. While the topology of the task graph can be constructed from the program structure, often the task execution times and data transfer costs between tasks depend on the input data, or more specifically, on the particular problem instance. Though this indicates that the optimal schedule of a task graph cannot be determined until the input data is available, it is possible to estimate the worst case processor requirement for the optimal schedule of a program solely from the topology of its task graph. In this paper, we study the problem of estimating worst case processor requirements for scheduling (with cloning) layered task graphs based on their topology. We show that computing an accurate processor bound for layered graphs is NP-hard (even for two layers) and present a polynomial time algorithm which computes an upper bound on the processor requirement. We show that the algorithm provides tight bounds for several common classes of layered task graphs.

18.
Contention-aware scheduling with task duplication
Finding an efficient schedule for a task graph on several processors is a trade-off between maximising concurrency and minimising interprocessor communication. Task duplication is a technique that has been employed to reduce or avoid interprocessor communication. Certain tasks are duplicated on several processors to produce the data locally and avoid the communication among processors. Most of the algorithms using task duplication have been proposed for the classic scheduling model, which allows concurrent communication and ignores contention for communication resources. It is increasingly recognised that this classic model is unrealistic and does not permit creating accurate and efficient schedules. The recently proposed contention model introduces contention awareness into task scheduling by assigning the edges of the task graph to the links of the communication network. It is intuitive that scheduling under such a model benefits even more from task duplication, yet no such algorithm has been proposed as it is not trivial to duplicate tasks under the contention model. This paper proposes a contention-aware task duplication scheduling algorithm. We investigate the fundamentals for task duplication in the contention model and propose an algorithm that is based on state-of-the-art techniques found in task duplication and contention-aware algorithms. An extensive experimental evaluation demonstrates the significant improvements to the speedup of the produced schedules.
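A small illustration of the basic duplication test that underlies such algorithms: copy a parent task onto the child's processor when re-executing it locally lets the child start earlier than waiting for the interprocessor message. Under the contention model of the paper the message must also be scheduled on a link, which this hedged sketch does not attempt; all names and numbers are illustrative.

```python
def start_without_dup(parent_finish, comm_delay, proc_ready):
    """Child waits for the remote parent's result to arrive."""
    return max(parent_finish + comm_delay, proc_ready)

def start_with_dup(parent_cost, parent_ready_local, proc_ready):
    """Child's processor first re-runs (duplicates) the parent locally."""
    return max(parent_ready_local, proc_ready) + parent_cost

def should_duplicate(parent_cost, parent_finish, comm_delay,
                     parent_ready_local, proc_ready):
    return (start_with_dup(parent_cost, parent_ready_local, proc_ready)
            < start_without_dup(parent_finish, comm_delay, proc_ready))

# The parent costs 2 and its inputs are already local at t=0; waiting for the
# remote copy would cost 3 (finish) + 4 (communication) = 7, so duplicate.
print(should_duplicate(parent_cost=2, parent_finish=3, comm_delay=4,
                       parent_ready_local=0, proc_ready=1))   # True
```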

19.
Oh, Dong-Ik; Baker, T.P. Real-Time Systems, 1998, 15(2): 183-192
We consider the schedulability of a set of independent periodic tasks under fixed priority preemptive scheduling on homogeneous multiprocessor systems. Assuming there is no task migration between processors and each processor schedules tasks preemptively according to fixed priorities assigned by the Rate Monotonic policy, the scheduling problem reduces to assigning the set of tasks to disjoint processors in such a way that the schedulability of the tasks on each processor can be guaranteed. In this paper we show that the worst case achievable utilization for such systems is between n(2^(1/2) - 1) and (n+1)/(1 + 2^(1/(n+1))), where n stands for the number of processors. The lower bound represents 41 percent of the total system capacity and the upper bound represents 50 to 66 percent depending on n. Practicality of the lower bound is demonstrated by proving it can be achieved using a First Fit scheduling algorithm.
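A minimal sketch of First Fit partitioning of periodic tasks, admitting a task to a processor only if the per-processor Rate Monotonic utilization test U <= m(2^(1/m) - 1) still holds for the m tasks placed there. The task parameters and the use of the Liu-and-Layland bound as the admission test are illustrative; the paper's contribution is the worst-case utilization analysis, not this particular code.

```python
def rm_first_fit(tasks, num_procs):
    """tasks: list of (wcet, period). Returns a processor index per task,
    or None if First Fit cannot place some task."""
    bins = [[] for _ in range(num_procs)]
    assignment = []
    for c, t in tasks:
        for p, assigned in enumerate(bins):
            m = len(assigned) + 1
            u = sum(ci / ti for ci, ti in assigned) + c / t
            if u <= m * (2 ** (1 / m) - 1):   # Liu & Layland RM bound
                assigned.append((c, t))
                assignment.append(p)
                break
        else:
            return None                        # no processor could accept the task
    return assignment

# Four tasks on two processors; the last one only fits on the second.
print(rm_first_fit([(1, 4), (2, 5), (1, 10), (3, 9)], 2))
```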

20.
In this paper, we study the problem of optimizing the throughput of coarse-grain workflow applications, for which each task of the workflow is of a given type, and subject to failures. The goal is to map such an application onto a heterogeneous specialized platform, which consists of a set of processors that can be specialized to process one type of tasks. The objective function is to maximize the throughput of the workflow, i.e., the rate at which data sets can enter the system. If there is exactly one task per processor in the mapping, then we prove that the optimal solution can be computed in polynomial time. However, the problem becomes NP-hard if several tasks can be assigned to the same processor. Several polynomial time heuristics are presented for the most realistic specialized setting, in which tasks of the same type can be mapped onto the same processor, but a processor cannot process two tasks of different types. Also, we give an integer linear program formulation of this problem, which allows us to find the optimal solution (in exponential time) for small problem instances. Experimental results show that the best heuristics obtain a good throughput, much better than the throughput obtained with a random mapping. Moreover, we obtain a throughput close to the optimal solution in the particular cases in which the optimal throughput can be computed (small problem instances or particular mappings).
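Since the heuristics themselves are not described in the abstract, the following is only an illustrative greedy for the specialized setting it outlines: each processor handles a single task type, the workflow throughput is limited by the slowest type, so each additional processor goes to the type with the lowest current rate. Names and numbers are made up; this is not the paper's ILP or one of its evaluated heuristics.

```python
def assign_processors(work_per_type, num_procs):
    """work_per_type: type -> processing time per data set on one processor.
    Assumes num_procs >= number of types (each type gets at least one)."""
    procs = {k: 1 for k in work_per_type}
    for _ in range(num_procs - len(work_per_type)):
        # Type whose stage currently has the lowest rate bounds the throughput.
        slowest = min(work_per_type, key=lambda k: procs[k] / work_per_type[k])
        procs[slowest] += 1
    throughput = min(procs[k] / w for k, w in work_per_type.items())
    return procs, throughput

# Three task types sharing 8 specialized processors.
print(assign_processors({"decode": 4.0, "filter": 1.0, "encode": 2.0}, 8))
```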
