
1.

Cao Jie, Zeng Guosun 《Journal of Computer Applications (计算机应用)》2015,35(3):648-653

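Processor failures in cloud environments have become a problem that cloud computing cannot ignore, and fault tolerance has become a key requirement in the design and development of cloud computing systems. To address the low scheduling efficiency and the restriction to a single task type in some fault-tolerant scheduling algorithms, a fault-tolerant scheduling method is proposed that groups the processors and the primary and backup versions of tasks. A method for deciding whether backup versions can overlap in execution and a formula for the worst-case response time of a task are also given. Experiments and analysis show that, compared with previous algorithms, dividing the processors into two groups that execute primary versions and backup versions respectively reduces the time spent on schedulability tests during task scheduling, increases the opportunities for backup versions to overlap, and reduces the number of processors required, which is significant for improving processor utilization and the efficiency of fault-tolerant scheduling.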

2.

A decomposition-based multi-objective workflow scheduling algorithm in cloud environments

Li Kewu, Zhang Gongxuan, Zhu Zhaomeng 《Computer Engineering & Science (计算机工程与科学)》2016,38(8):1588-1594

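While cloud service providers offer users massive virtual resources, they also face a practical problem: how to schedule these resources so that a workflow completes at minimum cost (makespan, execution cost, resource utilization, and so on). For workflow scheduling in IaaS environments, a decomposition-based multi-objective workflow scheduling algorithm is proposed with makespan and execution cost as the objectives. The algorithm combines a list-based heuristic with the selection process of multi-objective evolutionary algorithms. It uses a decomposition method to split the multi-objective optimization problem into a set of single-objective subproblems and then solves these subproblems simultaneously, which makes the scheduling process simpler and more effective. The algorithm is evaluated on real-world workflows published by the Pegasus project. The results show that, compared with the MOHEFT and NSGA-II* algorithms, the proposed algorithm obtains better Pareto solution sets and has lower time complexity.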
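The decomposition step described in this abstract can be illustrated with a small sketch. The abstract does not say which decomposition is used, so the Tchebycheff scalarization below (common in MOEA/D-style methods) is an assumption, as are the function names and the (makespan, cost) tuples. Each weight vector defines one single-objective subproblem, and the sketch simply picks the best candidate schedule for each.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, float]  # (makespan, execution cost), both minimized


def weight_vectors(n: int) -> List[Tuple[float, float]]:
    """Return n evenly spread weight vectors for two objectives."""
    if n < 2:
        raise ValueError("need at least two weight vectors")
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def tchebycheff(objs: Objectives, weights: Sequence[float], ideal: Objectives) -> float:
    """Scalarize two objectives into one value (assumed decomposition)."""
    return max(w * abs(f - z) for f, w, z in zip(objs, weights, ideal))


def best_per_subproblem(candidates: List[Objectives], n_sub: int) -> List[Objectives]:
    """For each subproblem, return the candidate with the smallest scalar value."""
    # The ideal point takes the best observed value of each objective.
    ideal = (min(c[0] for c in candidates), min(c[1] for c in candidates))
    return [
        min(candidates, key=lambda c: tchebycheff(c, w, ideal))
        for w in weight_vectors(n_sub)
    ]
```

With three hypothetical schedules `(10.0, 5.0)`, `(6.0, 9.0)`, and `(8.0, 6.0)` and three subproblems, the cost-only subproblem selects the cheapest schedule, the makespan-only subproblem selects the fastest, and the balanced subproblem selects the compromise.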

3.

Reliability-aware scheduling strategy for heterogeneous distributed computing systems

Xiaoyong Tang Kenli Li Renfa Li Bharadwaj Veeravalli 《Journal of Parallel and Distributed Computing》2010

Heterogeneous computing systems are promising computing platforms, since systems based on a single parallel architecture may not be sufficient to exploit the parallelism available in the running applications. In some cases, heterogeneous distributed computing (HDC) systems can achieve higher performance at lower cost than single-machine supersystems. However, in HDC systems, processors and networks are not failure-free, and any kind of failure may be critical to the running applications. One way of dealing with such failures is to employ a reliable scheduling algorithm. Unfortunately, most existing scheduling algorithms for precedence-constrained tasks in HDC systems do not adequately consider the reliability requirements of inter-dependent tasks. In this paper, we design a reliability-driven scheduling architecture that can effectively measure system reliability, based on an optimal reliability communication path search algorithm, and then we introduce the reliability priority rank (RRank) to estimate a task’s priority by considering reliability overheads. Furthermore, based on the directed acyclic graph (DAG), we propose a reliability-aware scheduling algorithm for precedence-constrained tasks, which can achieve high reliability for applications. The comparison studies, based on both randomly generated graphs and the graphs of some real applications, show that our scheduling algorithm outperforms the existing scheduling algorithms in terms of makespan, scheduling length ratio, and reliability. At the same time, the improvement gained by our algorithm increases as the data communication among tasks increases.

4.

A fault-tolerant dynamic scheduling algorithm for multiprocessor real-time systems and its analysis

Manimaran G. Murthy C.S.R. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(11):1137-1152

Many time-critical applications require dynamic scheduling with predictable performance. Tasks corresponding to these applications have deadlines to be met despite the presence of faults. In this paper, we propose an algorithm to dynamically schedule arriving real-time tasks with resource and fault-tolerant requirements onto multiprocessor systems. The tasks are assumed to be nonpreemptable and each task has two copies (versions) which are mutually excluded in space, as well as in time in the schedule, to handle permanent processor failures and to obtain better performance, respectively. Our algorithm can tolerate more than one fault at a time, and employs performance-improving techniques such as 1) the distance concept, which decides the relative position of the two copies of a task in the task queue, 2) flexible backup overloading, which introduces a trade-off between degree of fault tolerance and performance, and 3) resource reclaiming, which reclaims resources both from deallocated backups and early completing tasks. We quantify, through simulation studies, the effectiveness of each of these techniques in improving the guarantee ratio, which is defined as the percentage of all tasks arriving in the system whose deadlines are met. We also compare, through simulation studies, the performance of our algorithm with a best-known algorithm for the problem, and show analytically the importance of the distance parameter in fault-tolerant dynamic scheduling in multiprocessor real-time systems.
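Two notions from this abstract are easy to make concrete: the requirement that the two copies of a task be mutually excluded in space and in time, and the guarantee ratio. The sketch below is only an illustration under assumed names (`Copy`, `mutually_excluded`, `guarantee_ratio`); it is not the authors' scheduler and does not model the distance concept, backup overloading, or resource reclaiming.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One scheduled copy (primary or backup) of a task; fields are assumed."""
    processor: int
    start: float
    finish: float


def mutually_excluded(primary: Copy, backup: Copy) -> bool:
    """True if the copies are on different processors and do not overlap in time."""
    different_processor = primary.processor != backup.processor
    no_time_overlap = primary.finish <= backup.start or backup.finish <= primary.start
    return different_processor and no_time_overlap


def guarantee_ratio(deadlines_met: int, tasks_arrived: int) -> float:
    """Percentage of arrived tasks whose deadlines are met."""
    if tasks_arrived <= 0:
        raise ValueError("tasks_arrived must be positive")
    return 100.0 * deadlines_met / tasks_arrived
```

For example, a primary on processor 0 over [0, 5] and a backup on processor 1 over [5, 10] satisfy both exclusions, while a backup on the same processor, or one that starts at time 3, does not.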

5.

An efficient list-based task scheduling algorithm for heterogeneous distributed computing environment

Rajarethinam Madhura Vaidyanathan Rhymend Uthariaraj Benjamin Lydia Elizabeth 《Software》2023,53(2):390-412

Task scheduling in heterogeneous distributed computing systems plays a crucial role in reducing the makespan and maximizing resource utilization. The diverse nature of the devices in such systems increases the complexity of scheduling tasks. To overcome this problem, a new list-based static task scheduling algorithm, Deadline-Aware-Longest-Path-of-all-Predecessors (DA-LPP), is proposed in this article. In the prioritization phase of the DA-LPP algorithm, the path length of the current task from all its predecessors at each level is computed, and the longest of these path lengths is assigned as the rank of the task. This strategy emphasizes the tasks on the critical path. This well-optimized prioritization phase leads to an observable reduction in the makespan of the applications. In the processor selection phase, the DA-LPP algorithm implements an improved insertion-based policy that makes effective use of the unoccupied free time slots of the processors, which improves resource utilization. In addition, a least-computation-cost allocation approach is followed to minimize the overall computation cost of the processors, and a parental prioritization policy is incorporated to further reduce the scheduling length. To demonstrate the robustness of the proposed algorithm, a synthetic graph generator is used to generate a wide variety of graphs. Apart from the synthetic graphs, real-world application graphs such as Montage, LIGO, Cybershake, and Epigenomic are also used to assess the performance of the DA-LPP algorithm. Experimental results show that DA-LPP improves on other algorithms such as DQWS, DUCO, DCO, and EPRD in terms of scheduling length ratio, makespan reduction rate, and resource reduction rate. The results reveal that, for a 1000-task set with a deadline equal to twice the critical path, the scheduling length ratio of the DA-LPP algorithm is better than that of DQWS by 35%, DUCO by 23%, DCO by 26%, and EPRD by 17%.
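The prioritization idea in this abstract, ranking each task by the longest path that reaches it through its predecessors, can be sketched as follows. This is one plausible reading of the description, not the published DA-LPP procedure: the function name, the use of a single computation cost per task, and the edge communication costs are all assumptions, and the deadline-aware parts are not modeled.

```python
from graphlib import TopologicalSorter
from typing import Dict, Iterable, Tuple


def longest_path_ranks(
    cost: Dict[str, float],
    preds: Dict[str, Iterable[str]],
    comm: Dict[Tuple[str, str], float],
) -> Dict[str, float]:
    """Rank each task by the longest path ending at it (assumed definition).

    cost  : computation cost of every task (all tasks must appear here)
    preds : task -> its direct predecessors
    comm  : (predecessor, task) -> communication cost, 0 if missing
    """
    graph = {task: tuple(preds.get(task, ())) for task in cost}
    rank: Dict[str, float] = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        incoming = [rank[p] + comm.get((p, task), 0.0) for p in graph[task]]
        rank[task] = cost[task] + max(incoming, default=0.0)
    return rank
```

On a hypothetical four-task diamond graph (`a` feeding `b` and `c`, both feeding `d`), the rank of `d` is its own cost plus the larger of the two incoming path lengths, so the task at the end of the critical path receives the largest rank.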

6.

Configuration Reusing in On-Line Task Scheduling for Reconfigurable Computing Systems

Maisam Mansub Bassiri Hadi Shahriar Shahhoseini 《Journal of Computer Science and Technology (计算机科学技术学报)》2011,26(3):463-473

Reconfigurable computing systems can be reconfigured at runtime and support partial reconfigurability, which makes it possible to execute tasks in a true multitasking manner. To manage such systems at runtime, a reconfigurable operating system is needed. The main part of this operating system is the resource management unit, which performs on-line scheduling and placement of hardware tasks at runtime. Reconfiguration overhead is an important obstacle that limits the performance of on-line scheduling algorithms in reconfigurable computing systems and increases the overall execution time. Configuration reusing (task reusing) can decrease reconfiguration overhead considerably, particularly in periodic applications or applications in which the probability of task recurrence is high. In this paper, we present a technique called reusing-based scheduling (RBS) for on-line scheduling and placement, in which configuration reusing is considered as a main characteristic in order to reduce reconfiguration overhead and decrease the total execution time of the tasks. Several experiments have been conducted on the proposed algorithm. The results show considerable improvement in the overall execution time of the tasks.

7.

An active-replication fault-tolerant scheduling algorithm for real-time heterogeneous systems

Mao Dengfeng, Hu Wei, Liu Jing, Gan Yu 《Computer Systems & Applications (计算机系统应用)》2023,32(1):109-118

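When designing fault-tolerant scheduling algorithms for real-time heterogeneous systems, one must both respect real-time constraints and maximize system reliability. Moreover, scheduling parallel applications on heterogeneous systems has been proven to be NP-complete. Most existing fault-tolerant scheduling algorithms use replication to improve system reliability, but executing a task multiple times lengthens the application's execution time and degrades real-time performance. To address this, a fault-tolerant scheduling algorithm based on active replication is proposed. The algorithm repeatedly replicates the task in the task set that currently has the least impact on the system's real-time performance, and then schedules every task in the set onto the processor on which it finishes earliest, improving system reliability while satisfying the real-time constraint. Experiments show that, compared with the DB-FTSA algorithm, which also targets real-time heterogeneous systems, the proposed algorithm achieves considerably higher reliability when the real-time constraint is tight.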

8.

An intelligent water drops-based workflow scheduling for IaaS cloud

《Applied Soft Computing》2019

Cloud computing is an emerging technology in a distributed environment with a collection of large-scale heterogeneous systems. One of the challenging issues in the cloud data center is to select the minimum number of virtual machine (VM) instances to execute the tasks of a workflow within a time limit. The objectives of such a strategy are to minimize the total execution time of a workflow and improve resource utilization. However, the existing algorithms do not guarantee high resource utilization, although they can achieve high execution efficiency. Higher resource utilization depends on the reusability of VM instances. In this work, we propose a new intelligent water drops-based workflow scheduling algorithm for Infrastructure-as-a-Service (IaaS) cloud. The objectives of the proposed algorithm are to achieve higher resource utilization and minimize the makespan within the given deadline and budget constraints. The first contribution of the algorithm is to find multiple partial critical paths (PCPs) of a workflow, which helps in finding suitable VM instances. The second contribution is a scheduling strategy for PCP-VM assignment for assigning the VM instances. The proposed algorithm is evaluated through various simulation runs using synthetic datasets and various performance metrics. Through comparison, we show the superior performance of the proposed algorithm over the existing ones.

9.

Fault-tolerant scheduling of multiple mixed task types in distributed control systems

Sun Shaohai, Liu Huai 《Computer Engineering and Applications (计算机工程与应用)》2007,43(2):81-85,113

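Distributed control systems contain tasks with different timing requirements, including hard real-time, soft real-time, and non-real-time tasks. Hard real-time tasks must complete before their deadlines, otherwise the consequences can be catastrophic, so a distributed control system must provide some degree of fault tolerance. The paper first presents a uniprocessor scheduling algorithm for tasks with mixed timing requirements, the dual-priority-queue scheduling algorithm, and analyzes its schedulability condition. For distributed control systems, and allowing the primary and backup versions of a task to have different execution times, it then combines version replication with the uniprocessor scheduling algorithm to propose a new fault-tolerant scheduling algorithm. The schedulability of the algorithm is analyzed, and a method for testing the schedulability of a task set and a method for setting the deadlines of primary-version tasks are given. On this basis, a heuristic static task allocation algorithm is used to balance the load across processors. While guaranteeing that tasks remain schedulable under faults, the algorithm improves the utilization of each processor in the system, and simulation results show that it is effective.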

10.

An Integrated Approach to Locality-Conscious Processor Allocation and Scheduling of Mixed-Parallel Applications

Vydyanathan Naga Krishnamoorthy Sriram Sabin Gerald M. Catalyurek Umit V. Kurc Tahsin Sadayappan Ponnuswamy Saltz Joel H. 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(8):1158-1172

Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.

11.

DSPN modeling and analysis of fault-tolerant real-time task scheduling

Zhou Yue, Wang Xun, Guo Wei 《Computer Measurement & Control (计算机测量与控制)》2017,25(1):107-110

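The formal description of complex systems plays a very important role in designing new systems and in improving and evaluating existing ones. For fault-tolerant real-time scheduling of mixed tasks on processor systems, deterministic and stochastic Petri nets (DSPN) are proposed for modeling and performance analysis. First, tasks are divided into four classes according to their execution priority, periodicity, fault tolerance, and real-time requirements. Then, DSPNs are used to model the task scheduling and execution process, preemptive scheduling of tasks with different priorities, and processor failure and recovery, which together form a DSPN model of fault-tolerant real-time task scheduling on a processor system. Finally, simulation results show that, under the same load, processor utilization is essentially the same, and the fault-tolerant real-time scheduling algorithm effectively reduces the task miss rate. The DSPN model of fault-tolerant real-time task scheduling lays a foundation for Petri net modeling and analysis of complex task scheduling systems and provides theoretical guidance for practical engineering applications.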

12.

A design fix to supervisory control for fault-tolerant scheduling of real-time multiprocessor systems with aperiodic tasks

Rajesh Devaraj Arnab Sarkar 《International journal of control》2013,86(11):2211-2216

In the article ‘Supervisory control for fault-tolerant scheduling of real-time multiprocessor systems with aperiodic tasks’, Park and Cho presented a systematic way of computing a largest fault-tolerant and schedulable language that provides information on whether the scheduler (i.e., supervisor) should accept or reject a newly arrived aperiodic task. The computation of such a language is mainly dependent on the task execution model presented in their paper. However, the task execution model is unable to capture the situation when the fault of a processor occurs even before the task has arrived. Consequently, under a task execution model that does not capture this situation, a task may be assigned for execution on a faulty processor. This problem has been illustrated with an appropriate example. Then, the task execution model of Park and Cho has been modified to strengthen the requirement that none of the tasks are assigned for execution on a faulty processor.

13.

Reliability versus performance for critical applications

Alain Girault Érik Saule Denis Trystram 《Journal of Parallel and Distributed Computing》2009

Applications implemented on critical systems are subject to both safety-critical and real-time constraints. Classically, applications are specified as precedence task graphs that must be scheduled onto a given target multiprocessor heterogeneous architecture. We propose a new method for simultaneously optimizing two objectives: the execution time and the reliability of the schedule. The problem is decomposed into two successive steps: a spatial allocation during which the reliability is maximized (randomized algorithm), and a scheduling during which the makespan is minimized (list scheduling algorithm). It allows us to produce several trade-off solutions, among which the user can choose the solution that best fits the application’s requirements. Reliability is increased by replicating adequate tasks onto well-chosen processors. Our fault model assumes that processors are fail-silent, that they are subject to transient failures, and that the occurrences of failures follow a constant-parameter Poisson law. We assess and validate our method by running extensive simulations on both random graphs and actual application graphs. They show that it is competitive, in terms of makespan, compared to existing reference scheduling methods for heterogeneous processors (HEFT), while providing a better reliability.
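The fault model in this abstract (failures following a constant-parameter Poisson law, with reliability raised by replication) leads to a standard calculation, sketched below. The formulas are the usual textbook consequences of that model and are assumed here rather than taken from the paper: a replica that runs for time t on a processor with failure rate λ succeeds with probability exp(-λt), and a replicated task fails only if every replica fails independently. The function names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def replica_reliability(failure_rate: float, duration: float) -> float:
    """Probability that no failure occurs during one replica's execution."""
    return math.exp(-failure_rate * duration)


def task_reliability(replicas: Iterable[Tuple[float, float]]) -> float:
    """Reliability of a task given (failure_rate, duration) for each replica.

    Assumes replica failures are independent; the task succeeds if at
    least one replica completes without a failure.
    """
    all_fail = 1.0
    for failure_rate, duration in replicas:
        all_fail *= 1.0 - replica_reliability(failure_rate, duration)
    return 1.0 - all_fail
```

Under these assumptions, adding a second replica to a task always raises its reliability, which is the trade-off against makespan that the abstract describes.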

14.

A resource-aware scheduling algorithm with reduced task duplication on heterogeneous computing systems

Jing Mei Kenli Li Keqin Li 《The Journal of supercomputing》2014,68(3):1347-1377

To satisfy the high-performance requirements of application executions, many kinds of task scheduling algorithms have been proposed. Among them, duplication-based scheduling algorithms achieve higher performance compared to others. However, because of their greedy nature, they duplicate the parents of each task as long as the finish time can be reduced, which leads to superfluous resource consumption. A large number of these duplications are unnecessary, because a slight delay of some uncritical tasks does not affect the overall makespan. Moreover, these redundant duplications occupy resources, delay the execution of subsequent tasks, and consequently increase the schedule makespan. In this paper, we propose a novel duplication-based algorithm designed to overcome the above drawbacks. The proposed algorithm schedules tasks with the fewest redundant duplications. An optimizing scheme is introduced to further search for and remove redundancy in a schedule generated by the proposed algorithm. Randomly generated directed acyclic graphs and two real-world applications are tested in our experiments. Experimental results show that the proposed algorithm can save up to 15.59% of resource consumption compared with the other algorithms. The makespan improves as well.

15.

Reputation-based dependable scheduling of workflow applications in Peer-to-Peer Grids

Mustafizur Rahman Rajiv Ranjan Rajkumar Buyya 《Computer Networks》2010,54(18):3341-3359

Grids facilitate the creation of wide-area collaborative environments for sharing computing or storage resources and various applications. Inter-connecting distributed Grid sites through a peer-to-peer routing and information dissemination structure (also known as Peer-to-Peer Grids) is essential to avoid the problems of scheduling efficiency bottleneck and single point of failure in the centralized or hierarchical scheduling approaches. On the other hand, uncertainty and unreliability are facts in distributed infrastructures such as Peer-to-Peer Grids, which are triggered by multiple factors including scale, dynamism, failures, and incomplete global knowledge. In this paper, a reputation-based Grid workflow scheduling technique is proposed to counter the effect of inherent unreliability and temporal characteristics of computing resources in large-scale, decentralized Peer-to-Peer Grid environments. The proposed approach builds upon structured peer-to-peer indexing and networking techniques to create a scalable wide-area overlay of Grid sites for supporting dependable scheduling of applications. The scheduling algorithm considers the reliability of a Grid resource as a statistical property, which is globally computed in the decentralized Grid overlay based on dynamic feedback or reputation scores assigned by individual service consumers mediated via Grid resource brokers. The proposed algorithm dynamically adapts to changing resource conditions and offers significant performance gains as compared to traditional approaches in the event of unsuccessful job execution or resource failure. The results, evaluated through an extensive trace-driven simulation, show that our scheduling technique can reduce the makespan by up to 50% and successfully isolate the failure-prone resources from the system.

16.

A cloud workflow scheduling strategy under a DAG partitioning model

Xue Fan 《Application Research of Computers (计算机应用研究)》2019,36(12)

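To optimize the monetary cost and execution efficiency of cloud workflow scheduling, a workflow scheduling algorithm based on directed acyclic graph (DAG) partitioning, PBWS, is proposed. With the goal of optimizing scheduling efficiency and cost together, the algorithm divides the scheduling process into three phases: partitioning of the workflow DAG, adjustment of the partitions, and resource allocation. The DAG partitioning phase computes an initial task partition graph while preserving the execution-order dependencies among tasks. The partition adjustment phase reassigns tasks among partitions to reduce the makespan. The resource allocation phase selects the most cost-efficient mapping of tasks to resources so that the total idle time of the resources is minimized. The algorithm was evaluated in simulation using five scientific workflow DAG models. The results show that PBWS greatly reduces the workflow execution cost at the expense of only a small increase in makespan, optimizes scheduling efficiency and scheduling cost together, and performs better overall than comparable algorithms.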

17.

Fault-tolerant adaptive scheduling for embedded real-time systems

Richardson P. Sieh L. Elkateeb A.M. 《Micro, IEEE》2001,21(5):41-51

Describes a fault-tolerant algorithm which uses a time-value scheduling approach to detect faults, sustain high processor utilization, and ensure timely execution of critical tasks.

18.

A hybrid real-time fault-tolerant scheduling algorithm for uniprocessor real-time systems

Dan Xiaomian, Shen Yuanlong 《Digital Community & Smart Home (数字社区&智能家居)》2005,(9):37-39

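Most existing work on scheduling algorithms for uniprocessor real-time systems can schedule only a single type of task. Building on the PKSA algorithm, this paper establishes a hybrid real-time fault-tolerant model and proposes a scheduling algorithm that can schedule not only periodic tasks with fault-tolerance requirements but also periodic tasks without fault-tolerance requirements and aperiodic non-real-time tasks, thereby scheduling mixed task types.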

19.

Dynamic scheduling of task graphs in multi-FPGA systems using critical path

Ramezani Reza 《The Journal of supercomputing》2021,77(1):597-618

SRAM-based FPGAs feature high performance and flexibility. Thus, they have found many applications in modern high-performance computing (HPC) systems. These systems suffer from limited computing resources for running HPC applications. Therefore, multi-FPGA systems have emerged to alleviate such resource limitations. In this regard, efficient scheduling strategies are required to dynamically steer the execution of applications, represented as task graphs, on a set of connected FPGAs. In this paper, a heuristic-based dynamic critical path-aware scheduling technique named CPA is presented to schedule task graphs on multi-FPGA systems. The proposed technique, by considering the computation and communication capabilities of FPGAs, dynamically assigns priority to tasks in different steps in order to achieve better makespans. The proposed technique has been evaluated by conducting several experiments on real-world task graphs and on three different shapes of random task graphs with different numbers of tasks, and its efficiency has been compared with that of three task graph scheduling approaches. The obtained results demonstrate that the proposed CPA technique outperforms well-known heuristic scheduling strategies and improves their makespan by 13.47% on average. In addition, the experiments show that the proposed technique generates the schedules in the order of milliseconds and that the average of its yielded makespans is 12.05% longer than that of an optimum schedule.

20.

A dynamic real-time scheduling algorithm based on software fault tolerance

Han Jianjun, Li Qinghua, Abbas A. Essa 《Journal of Chinese Computer Systems (小型微型计算机系统)》2005,26(4):658-661

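In hard real-time systems, a task that misses its deadline can lead to catastrophic consequences, so such systems are subject to strict timing and reliability constraints. Most current real-time fault-tolerant scheduling algorithms address hardware faults and seldom consider faults in running software. This paper proposes PKSA (Probing-step Algorithm), an EDF-like dynamic real-time scheduling algorithm with software fault tolerance. During task execution, the algorithm uses several probing checks to improve the prediction of whether a task can be executed and avoids, as far as possible, the impact of early task failures on subsequent tasks. It thereby raises the task completion rate while effectively reducing wasted CPU time slices. Experimental tests show that it achieves a better ratio of scheduling performance to scheduling cost than comparable known algorithms.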