首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Communication contention in task scheduling   总被引:4,自引:0,他引:4  
Task scheduling is an essential aspect of parallel programming. Most heuristics for this NP-hard problem are based on a simple system model that assumes fully connected processors and concurrent interprocessor communication. Hence, contention for communication resources is not considered in task scheduling, yet it has a strong influence on the execution time of a parallel program. This paper investigates the incorporation of contention awareness into task scheduling. A new system model for task scheduling is proposed, allowing us to capture both end-point and network contention. To achieve this, the communication network is reflected by a topology graph for the representation of arbitrary static and dynamic networks. The contention awareness is accomplished by scheduling the communications, represented by the edges in the task graph, onto the links of the topology graph. Edge scheduling is theoretically analyzed, including aspects like heterogeneity, routing, and causality. The proposed contention-aware scheduling preserves the theoretical basis of task scheduling. It is shown how classic list scheduling is easily extended to this more accurate system model. Experimental results show the significantly improved accuracy and efficiency of the produced schedules.  相似文献   

2.
Toward a realistic task scheduling model   总被引:2,自引:0,他引:2  
Task scheduling is an important aspect of parallel programming. Most of the heuristics for this NP-hard problem are based on a very simple system model of the target parallel system. Experiments revealed the inappropriateness of this classic model to obtain accurate and efficient schedules for real-systems. In order to overcome this shortcoming, a new scheduling model was proposed that considers the contention for communication resources. Even though the accuracy and efficiency improved with the consideration of contention, the new contention model is still not good enough. The crucial aspect is the involvement of the processor in communication. This paper investigates the involvement of the processor in communication and its impact on task scheduling. A new system model is proposed based on the contention model that is aware of the processor involvement. The challenges for the scheduling techniques are analyzed and two scheduling algorithms are proposed. Experiments on real parallel systems show the significantly improved accuracy and efficiency of the new model and algorithms.  相似文献   

3.
This paper presents a neighborhood search algorithm for heterogeneous multiprocessor scheduling in which loop pipelining is used to exploit parallelism between iterations. The method adopts a realistic model for interprocessor communication where resource contention is taken into consideration. The schedule representation scheme is flexible so that communication scheduling can be performed in a generic manner. Base on a general time formulation of the schedule performance, the algorithm improves an initial schedule in an efficient way by successive modification to the task processor mapping and task ordering. Simulation results show that significant improvement over existing methods can be obtained. A parallel software video encoder was implemented based on the scheduling result and real time performance was achieved with pipelining of frame encoding.  相似文献   

4.
The problem of exploiting the effective utilization of a multiprocessor system essentially depends on optimal scheduling of parallel tasks onto the processors in the system. A recently proposed compile-time scheduling algorithm based on the 0–1 linear programming algorithm with the branch and bound technique, to produce optimal schedules, has the problems of communication link contention, nonoptimality, and modeling incompletely connected processor systems. In this paper, we present a modified version of this algorithm for producing contention-free optimal schedules for any arbitrary multiprocessor topology. To alleviate the impediments of large requirements of CPU time for the optimal scheduling algorithm, we have developed three new effective techniques, namely,processor isomorphism, look ahead pruning, andlower bound on the completion time of tasks. The performance of our algorithm is analyzed using DFT and LU decomposition methods as benchmarks.  相似文献   

5.
Contention-aware scheduling with task duplication   总被引:1,自引:0,他引:1  
Finding an efficient schedule for a task graph on several processors is a trade-off between maximising concurrency and minimising interprocessor communication. Task duplication is a technique that has been employed to reduce or avoid interprocessor communication. Certain tasks are duplicated on several processors to produce the data locally and avoid the communication among processors. Most of the algorithms using task duplication have been proposed for the classic scheduling model, which allows concurrent communication and ignores contention for communication resources. It is increasingly recognised that this classic model is unrealistic and does not permit creating accurate and efficient schedules. The recently proposed contention model introduces contention awareness into task scheduling by assigning the edges of the task graph to the links of the communication network. It is intuitive that scheduling under such a model benefits even more from task duplication, yet no such algorithm has been proposed as it is not trivial to duplicate tasks under the contention model. This paper proposes a contention-aware task duplication scheduling algorithm. We investigate the fundamentals for task duplication in the contention model and propose an algorithm that is based on state-of-the-art techniques found in task duplication and contention-aware algorithms. An extensive experimental evaluation demonstrates the significant improvements to the speedup of the produced schedules.  相似文献   

6.
基于DAG的静态任务调度算法已有深入的研究及应用.目前的调度算法大多假定处理器之间可以并行接收数据,而没有考虑实际应用中通信链路的竞争及延迟,进而导致调度算法在具体应用中效率较低.侧重研究同构计算环境下具有依赖关系任务的边调度问题,结合传统任务调度问题中的有效策略,提出基于优化插入的调度算法(OISA).OISA根据实际问题的具体特征,采用改进的路由算法选择负载较少的数据链路,并通过形式化的证明以优化通信数据在链路的开始传输时间,以达到降低调度长度的目的.通过试验测试表明,OISA在性能上明显优于目前已有的相关算法.  相似文献   

7.
任务调度是并行处理的一个非常关键的方面。目前的调度算法大多假定处理器完全互连、可以并行接收数据,而没有考虑实际应用中通信链路的竞争及延迟,进而导致调度算法在具体应用中效率较低。论文研究异构计算环境下具有依赖关系任务的边调度问题,结合传统任务调度问题中的有效策略,提出一种新的调度算法,该算法通过串行化通信边使通信竞争集成化。实验结果表明,与各种经典调度方案相比,该算法显著地改善了精确性和效率。  相似文献   

8.
一个调度Fork-Join任务图的新算法   总被引:16,自引:1,他引:16  
刘振英  方滨兴  姜誉  张毅  赵宏 《软件学报》2002,13(4):693-697
任务调度是影响工作站网络效率的关键因素之一.Fork-Join任务图可以代表很多并行结构,但其他已有调度Fork-Join任务图算法忽略了在非全互连工作站网络环境中通信之间不能并行执行的问题,有些效率高的算法又没有考虑节省处理器个数的问题.因此,专门针对该任务图,综合考虑调度长度、非并行通信和节省处理器个数问题,提出了一个基于任务复制的静态调度算法TSA_FJ.通过随机产生任务的执行时间和通信时间,生成了多个Fork-Join任务图,并且采用TSA_FJ算法和其他调度算法对生成的任务图进行调度.结果表明,  相似文献   

9.
3D modeling and codec of real objects are hot issues in the field of virtual reality. In this paper, we propose an automatic registration two range images method and a cycle based automatic global registration algorithm for rapidly and automatically registering all range images and constructing a realistic 3D model. Besides, to meet the requirement of huge data transmission over Internet, we present a 3D mesh encoding/decoding method for encoding geometry, topology and attribute data with high compression ratio and supporting progressive transmission. The research results have already been applied successfully in digital museum. Supported by the National Natural Science Foundation of China (Grant Nos. 60533070, 60773153), the Key Grant Project of Chinese Ministry of Education (Grant No. 308004), the Project of Chinese Ministry of Science and Technology (Grant No. 2006BAK12B09), and the Project of Beijing Municipal Science and Technology Commission (Grant No. Z07000100560714)  相似文献   

10.
The multiprocessor scheduling problem is the problem of scheduling the tasks of a precedence constrained task graph (representing a parallel program) onto the processors of a multiprocessor in a way that minimizes the completion time. Since this problem is known to be NP-hard in the strong sense in all but a few very restricted eases, heuristic algorithms are being developed which obtain near optimal schedules in a reasonable amount of computation time. We present an efficient heuristic algorithm for scheduling precedence constrained task graphs with nonnegligible intertask communication onto multiprocessors taking contention in the communication channels into consideration. Our algorithm for obtaining satisfactory suboptimal schedules is based on the classical list scheduling strategy. It simultaneously exploits the schedule-holes generated in the processors and in the communication channels during the scheduling process in order to produce better schedules. We demonstrate the effectiveness of our algorithm by comparing with two competing heuristic algorithms available in the literature  相似文献   

11.
针对异构多核处理器间的任务调度问题,为了更好地发挥异构多核处理器间的平台优势,提出一种基于将有关联的且不在同一处理器上的任务进行复制的思想,从而使每个异构多核的处理器能独立执行任务,来减少不同处理器之间的通信开销,并且通过混合粒子群算法(HPSO)来调度异构多核处理器中的任务,避免由于当任意一个异构多核处理器由于任务分配过多而导致计算机不能及时且准确地得出结果.最后实验证明,对比传统的启发式分配方案和常见的遗传算法(GA),基于任务复制思想分配方案和混合粒子群算法(HPSO)具有更好的求解能力,并且可以提供执行时间更少的调度分配方案,具有较好的应用价值.  相似文献   

12.
The purpose of this paper is to examine the impact of scheduling parallel tasks onto non-uniform memory access (NUMA) shared-memory multiprocessor systems by considering non-negligible intertask communications and communication contentions. Communication contentions arise from the communication medium having insufficient capacity to serve all transmissions, thereby causing significant contention delays. Therefore, a new scheduling algorithm, herein referred to as the Extended Critical Path (ECP) algorithm is proposed. The proposed algorithm schedules parallel tasks by considering non-negligible intertask communications and the contentions among shared communication resources. Moreover, it ensures performance within a factor of two of the optimum for general directed acyclic task graphs (DATGs). Experimental results demonstrate the superiority of the ECP algorithm over the scheduling algorithms presented in previous literature.  相似文献   

13.
关联任务在多核处理器上并行调度所产生的通信时延,会对任务调度长度和处理器利用率造成负面影响,为了改善多核系统对关联任务的处理性能,针对关联任务在多核处理器上的调度特点,提出一种并行感知调度算法。计算各任务与终点间的最长路径值,按照该值的降序来分配任务调度次序,在分配处理器内核时兼顾关联度和任务最早可执行时间,设置最佳匹配评价函数。实验结果表明,与busHEFT和DTSV算法相比,该算法具有更短的任务调度时延、更少的通信量以及更高的处理器利用率。  相似文献   

14.
现代并行系统的复杂调度问题可以转化为Fork-join图的任务调度问题.然而在实际计算环境中,两个处理节点之间的通信大多以独占方式进行,现有的大多数任务调度算法往往忽略了对通信信道独占性的考虑.提出了一种带通信限制的Fork-join图调度算法CCTD.该算法引入了实际环境中的通信独占性限制,同时保证了Fork-join图的基于复制的优化调度,而且尽可能地减少了对处理器占用.实验结果表明,CCTD算法是一种适应性强的、高效的Fork-join图调度算法.  相似文献   

15.
在一组相同处理器上调度带有通信延迟的任务图以实现其最短的执行时间,这在并行计算的调度理论和实践中具有重要的意义。针对具有通信延迟的任务图调度问题,提出一种基于可满足性模理论(SMT)的改进SMT方法。首先,将处理器映射约束和任务执行顺序等约束条件进行编码,将任务图调度问题转化为SMT问题;然后,调用SMT求解器对可行解空间进行搜索,以确定问题最优解。在约束编码阶段,使用整型变量表示任务和处理器的映射关系,从而降低处理器约束编码的复杂程度;在求解器调用阶段,通过添加独立任务的约束条件减小求解器的搜索空间,进一步提升最优解的查找效率。实验结果表明,与原始SMT方法相比,改进SMT方法在20 s和1 min超时实验中的平均求解时间分别减少了65.9%与53.8%,并且在处理器数量较多时取得了更大的效率优势。改进的SMT方法可以有效求解带通信延迟的任务图调度问题,尤其适用于处理器数量较多的调度场景。  相似文献   

16.
In many applications, XML documents need to be modelled as graphs. The query processing of graph-structured XML documents brings new challenges. In this paper, we design a method based on labelling scheme for structural queries processing on graph-structured XML documents. We give each node some labels, the reachability labelling scheme. By extending an interval-based reachability labelling scheme for DAG by Rakesh et al., we design labelling schemes to support the judgements of reachability relationships for general graphs. Based on the labelling schemes, we design graph structural join algorithms to answer the structural queries with only ancestor-descendant relationship efficiently. For the processing of subgraph query, we design a subgraph join algorithm. With efficient data structure, the subgraph join algorithm can process subgraph queries with various structures efficiently. Experimental results show that our algorithms have good performance and scalability. Support by the Key Program of the National Natural Science Foundation of China under Grant No.60533110; the National Grand Fundamental Research 973 Program of China under Grant No. 2006CB303000; the National Natural Science Foundation of China under Grant No. 60773068 and No. 60773063.  相似文献   

17.
Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.  相似文献   

18.
On exploiting task duplication in parallel program scheduling   总被引:1,自引:0,他引:1  
One of the main obstacles in obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead which is incurred when tasks executing on different processors exchange data. Given a task graph, duplication-based scheduling can mitigate this overhead by allocating some of the tasks redundantly on more than one processor. In this paper, we focus on the problem of using duplication in static scheduling of task graphs on parallel and distributed systems. We discuss five previously proposed algorithms and examine their merits and demerits. We describe some of the essential principles for exploiting duplication in a more useful manner and, based on these principles, propose an algorithm which outperforms the previous algorithms. The proposed algorithm generates optimal solutions for a number of task graphs. The algorithm assumes an unbounded number of processors. For scheduling on a bounded number of processors, we propose a second algorithm which controls the degree of duplication according to the number of available processors. The proposed algorithms are analytically and experimentally evaluated and are also compared with the previous algorithms  相似文献   

19.
基于多核处理器并行系统的任务调度算法   总被引:6,自引:0,他引:6  
针对多核处理器并行系统的特点,提出了相应的任务调度算法,该算法在任务调度之前加入了任务分配技术,通过合理的任务分配,可有效减少多个处理器间的通信开销,使任务调度效率更佳.仿真实现了该算法,并通过实验数据证明了该算法的优越性.  相似文献   

20.
Task graph pre-scheduling, using Nash equilibrium in game theory   总被引:1,自引:1,他引:0  
Prescheduling algorithms are targeted at restructuring of task graphs for optimal scheduling. Task graph scheduling is a NP-complete problem. This article offers a prescheduling algorithm for tasks to be executed on the networks of homogeneous processors. The proposed algorithm merges tasks to minimize their earliest start time while reducing the overall completion time. To this end, considering each task as a player attempting to reduce its earliest time as much as possible, we have applied the idea of Nash equilibrium in game theory to determine the most appropriate merging. Also, considering each level of a task graph as a player, seeking for distinct parallel processors to execute each of its independent tasks in parallel with the others, the idea of Nash equilibrium in game theory can be applied to determine the appropriate number of processors in a way that the overall idle time of the processors is minimized and the throughput is maximized. The communication delay will be explicitly considered in the comparisons. Our experiments with a number of known benchmarks task graphs and also two well-known problems of linear algebra, LU decomposition and Gauss–Jordan elimination, demonstrate the distinguished scheduling results provided by applying our algorithm. In our study, we consider ten scheduling algorithms: min–min, chaining, A ?, genetic algorithms, simulated annealing, tabu search, HLFET, ISH, DSH with task duplication, and our proposed algorithm (PSGT).  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号