Similar literature
 Found 20 similar articles; search time 46 ms
1.
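The complexity of high-performance computer architectures places higher demands on users; moreover, in engineering practice and scientific experiments, solving a complex problem usually requires several application software packages working together. To address the usability of supercomputing resources and the need to integrate and coordinate multiple kinds of software, a scientific workflow application platform for supercomputing environments was developed, an asynchronous, concurrent workflow execution engine was designed, and a resource scheduling scheme was given, following a design strategy that separates the scheduling algorithm from the scheduler and the engine. A local resource pooling technique and a resource reservation algorithm are proposed, the performance of five commonly used scheduling algorithms is compared and analyzed, and recommendations for algorithm selection are given. Practical application shows that the engine supports flexible execution modes for complex workflows and that the resource scheduling scheme supports efficient execution of workflow applications in supercomputing environments.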

2.
High performance computing (HPC) systems allow researchers and businesses to harness large amounts of computing power needed for solving complex problems. In such systems a job scheduler prioritizes the execution of jobs belonging to users of the system in a manner that allows the system to satisfy performance objectives for various groups of users while simultaneously making efficient use of available resources. Typically, system administrators have the responsibility of manually configuring or tuning the job scheduler such that the performance objectives of user groups as well as system-level performance objectives are met. Modern job schedulers used in production systems are quite complex. Through detailed trace-driven simulations, we show that manually tuning the configuration of production schedulers in an environment characterized by multiple performance objectives is very challenging and may not be feasible. To alleviate this problem, this paper describes a toolset that can help a system administrator to automatically configure a scheduler such that the performance objectives for various classes of users in the system as well as other system-level performance objectives can be satisfied. A unique aspect of this work that differentiates it from the existing work on scheduler tuning is that it has been implemented to work with a widely used production scheduler. Furthermore, in contrast to the existing work it considers the challenging real-world problem of delivering different levels of performance to different classes of users. System administrators can exploit the toolset to react quickly to changes in performance objectives and workload conditions. Case studies using synthetic and real HPC workloads demonstrate the effectiveness of the technique. Copyright © 2011 John Wiley & Sons, Ltd.

3.
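Design of a hierarchical grid task scheduler based on the OGSA grid   Total citations: 1, self-citations: 0, citations by others: 1
Based on the requirements and characteristics of grid task scheduling, and on a thorough analysis of the general grid task scheduling process, this paper also takes into account features of the grid computing environment, such as its virtualized, hierarchical, and autonomous nature, as well as the resource dependence, coarse granularity, and repeated execution of grid tasks under workflow task coordination requirements. On this basis, it presents an improved master-slave hierarchical scheduling model for grid workflow tasks, together with the scheduling policy and an implementation of the scheduling algorithm. The scheduler model has performed well in a real grid workflow task coordination system.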

4.
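To process massive data effectively for association analysis, business forecasting, and similar tasks, the Hadoop distributed cloud computing platform emerged. As Hadoop has become widely used, however, shortcomings in its job scheduling have become apparent: the existing job schedulers suffer from complex parameter settings, long startup times, and other drawbacks. Taking advantage of the strong self-organization and fast convergence of the artificial bee colony algorithm, a resource-aware scheduler that monitors Hadoop's internal resource usage in real time was designed and implemented. Compared with the original job schedulers, it requires fewer parameters and starts faster. Benchmark results show that, on a heterogeneous cluster, it schedules resource-intensive jobs roughly 10% to 20% faster than the original schedulers.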

5.
Efficient data-aware methods in job scheduling, distributed storage management and data management platforms are necessary for successful execution of data-intensive applications. However, research on methods for data-intensive scientific applications is insufficient in large-scale distributed cloud and cluster computing environments, and data-aware methods are becoming more complex. In this paper, we propose a Data-Locality Aware Workflow Scheduling (D-LAWS) technique and a locality-aware resource management method for data-intensive scientific workflows in HPC cloud environments. D-LAWS applies data locality and data transfer time based on network bandwidth to scientific workflow task scheduling and balances resource utilization and parallelism of tasks at the node level. Our method consolidates VMs and considers task parallelism by data flow during the planning of task executions of a data-intensive scientific workflow. We additionally consider more complex workflow models and data locality pertaining to the placement and transfer of data prior to task executions. We implement and validate the methods based on fairness in cloud environments. Experimental results show that the proposed methods can improve performance and data locality of data-intensive workflows in cloud environments.

6.
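In large-scale Hadoop clusters, a good task scheduling policy has an important effect on improving data locality, reducing network transfer overhead, shortening job execution time, and raising the cluster's job throughput. To address the low data locality of Reduce tasks in the Hadoop architecture, this paper proposes a Reduce task scheduling optimization algorithm based on the delay scheduling policy, which shortens job execution time and raises job throughput by improving the data locality of Reduce tasks. The algorithm applies a multi-level delay scheduling policy during the Early Shuffle phase of the Hadoop architecture to improve Reduce-task data locality. Finally, the algorithm was implemented by rewriting the code of the native fair scheduler and compared experimentally with the native fair scheduler; the results show that the algorithm markedly reduces job execution time and improves the cluster's job throughput.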

7.
This paper describes a PVM task scheduler designed and implemented by the authors. The scheduler supports selecting idle workstations, scheduling pool tasks and dynamically produced subtasks. It can improve resource utilization, reduce job response time and simplify programming.

8.
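Scheduling PVM tasks   Total citations: 8, self-citations: 0, citations by others: 8
Ju Jiubin (鞠九滨), Wang Yong (王勇). Chinese Journal of Computers (《计算机学报》), 1997, 20(5): 470-474
This paper presents a PVM task scheduling system designed and implemented by the authors. It selects idle machines and schedules both pooled tasks and dynamically generated subtasks, thereby improving processor utilization, shortening job response time, and simplifying user programming.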

9.
The workflow scheduling problem has drawn a lot of attention in the research community. This paper presents a workflow scheduling algorithm, called granularity score scheduling (GSS), which is based on the granularity of the tasks in a given workflow. The main objectives of GSS are to minimize the makespan and maximize the average virtual machine utilization. The algorithm consists of three phases, namely B-level calculation, score adjustment, and task ranking and scheduling. We simulate the proposed algorithm using various benchmark scientific workflow applications, i.e., Cybershake, Epigenomic, Inspiral and Montage. The simulation results are compared with two well-known existing workflow scheduling algorithms, namely heterogeneous earliest finish time and performance effective task scheduling, which are also applied in cloud computing environments. Based on the simulation results, the proposed algorithm performs remarkably well in terms of makespan and average virtual machine utilization.
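
The first GSS phase, B-level calculation, can be illustrated with the usual definition of a task's b-level: the length of the longest path from that task to an exit task, counting the task's own cost. The sketch below shows only that definition; the abstract does not describe the score adjustment or ranking rules, so they are not modeled, and the function name, the cost tables and the four-task workflow are hypothetical.

```python
from functools import lru_cache


def b_levels(cost, succ, comm=None):
    """Return the b-level of every task in a DAG.

    cost: dict mapping task -> execution cost
    succ: dict mapping task -> list of successor tasks
    comm: optional dict mapping (task, successor) -> communication cost
    """
    comm = comm or {}

    @lru_cache(maxsize=None)
    def level(task):
        # Longest remaining path through any successor, or 0 for an exit task.
        tails = [comm.get((task, nxt), 0) + level(nxt)
                 for nxt in succ.get(task, [])]
        return cost[task] + max(tails, default=0)

    return {task: level(task) for task in cost}


# Hypothetical diamond-shaped workflow: a -> {b, c} -> d
cost = {"a": 2, "b": 3, "c": 1, "d": 4}
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
levels = b_levels(cost, succ)

# Ranking tasks by decreasing b-level gives a valid topological order.
order = sorted(cost, key=levels.get, reverse=True)
```

Sorting by decreasing b-level is a common way to build a priority list for list scheduling; whether GSS ranks tasks exactly this way is not stated in the abstract.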

10.
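Fan Jing (范菁), Shen Jie (沈杰), Xiong Lirong (熊丽荣). Computer Science (《计算机科学》), 2015, 42(Z11): 400-405
Scheduling workflows that contain sensitive data in a hybrid cloud mainly involves allocating workflow tasks across the hybrid cloud, mapping computing resources to tasks, and optimizing scheduling cost, subject to data security and the workflow deadline. Integer programming is used to model and solve the hybrid cloud workflow scheduling problem under three constraints: data sensitivity, deadline, and scheduling cost. To speed up solving the model, candidate allocations of workflow tasks on the hybrid cloud are filtered according to the Pareto optimality principle, which reduces the problem size. Experiments show that eliminating unreasonable task allocations in advance effectively reduces the size of the integer programming model, shortens computation time, and yields good scheduling results with only a small error.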
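
The Pareto-based filtering step mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes each candidate allocation of a task is scored by two quantities to be minimized, execution time and cost; the abstract does not say which objectives the authors actually compare, and the option names and numbers are made up.

```python
def pareto_front(options):
    """Keep only the options that no other option dominates.

    options: list of (name, time, cost) tuples; lower time and cost are better.
    """
    def dominates(a, b):
        # a dominates b if it is no worse in both objectives and better in one.
        return (a[1] <= b[1] and a[2] <= b[2]
                and (a[1] < b[1] or a[2] < b[2]))

    return [o for o in options
            if not any(dominates(p, o) for p in options)]


# Hypothetical placements for one workflow task: (name, time, cost)
options = [
    ("private", 10, 0),
    ("public-small", 8, 3),
    ("public-large", 4, 6),
    ("public-slow", 9, 5),   # slower and dearer than public-small
]
kept = pareto_front(options)
```

Only the surviving options would then become decision variables in the integer program, which is how such a filter shrinks the model.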

11.
Cluster scheduling, where processors are grouped into clusters and the tasks that are allocated to one cluster are scheduled by a global scheduler, has attracted attention in multiprocessor real-time systems research recently. In this paper, assuming that an optimal global scheduler is adopted within each cluster, we investigate the worst-case utilization bounds for cluster scheduling with different task allocation/partitioning heuristics. First, we develop a lower limit on the utilization bounds for cluster scheduling with any reasonable task allocation scheme. Then, the lower limit is shown to be the exact utilization bound for cluster scheduling with the worst-fit task allocation scheme. For other task allocation heuristics (such as first-fit, best-fit, first-fit decreasing, best-fit decreasing and worst-fit decreasing), higher utilization bounds are derived for systems with both homogeneous clusters (where each cluster has the same number of processors) and heterogeneous clusters (where clusters have different numbers of processors). In addition, focusing on an efficient optimal global scheduler, namely the boundary-fair (Bfair) algorithm, we propose a period-aware task allocation heuristic with the goal of reducing the scheduling overhead (e.g., the number of scheduling points, context switches and task migrations). Simulation results indicate that the percentage of task sets that can be scheduled is significantly improved under cluster scheduling even for small-size clusters, compared to that of the partitioned scheduling. Moreover, compared to the simple generic task allocation scheme (e.g., first-fit), the proposed period-aware task allocation heuristic markedly reduces the scheduling overhead of cluster scheduling with the Bfair scheduler.
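
The worst-fit allocation scheme named in the abstract above can be sketched as follows. The sketch assumes that a cluster of k processors running an optimal global scheduler can accept tasks as long as their total utilization does not exceed k and no single task's utilization exceeds 1; the function name and the sample task sets are illustrative only and are not taken from the paper.

```python
def worst_fit(utilizations, cluster_sizes):
    """Assign each task to the cluster with the most remaining capacity.

    utilizations: task utilizations in arrival order, each in (0, 1]
    cluster_sizes: number of processors in each cluster
    Returns a list of cluster indices, or None if some task does not fit.
    """
    remaining = [float(k) for k in cluster_sizes]
    assignment = []
    for u in utilizations:
        # Worst-fit: pick the emptiest cluster (first one on ties).
        idx = max(range(len(remaining)), key=lambda i: remaining[i])
        if remaining[idx] < u - 1e-12:
            return None  # even the emptiest cluster cannot host this task
        remaining[idx] -= u
        assignment.append(idx)
    return assignment


# Two clusters of two processors each
placed = worst_fit([0.6, 0.5, 0.9, 0.4], [2, 2])

# Two single-processor clusters cannot host three heavy tasks
rejected = worst_fit([0.9, 0.9, 0.9], [1, 1])
```

Worst-fit spreads load evenly, which is why it tends to leave less room for a late heavy task than first-fit or best-fit; the abstract's result that worst-fit attains exactly the lower limit on the utilization bound is consistent with that intuition.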

12.
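Improved delay scheduling algorithm in a homogeneous Hadoop cluster environment   Total citations: 1, self-citations: 1, citations by others: 0
In the Hadoop framework, computing resources and data resources may reside at different physical locations, which gives rise to the locality problem. The delay scheduling algorithm was devised to address it: the algorithm takes the physical location of the data a task is to process as the job's computing node and schedules the task to that target node. However, several tasks of the same job may end up concentrated on a single computing node, so the job fails to achieve the desired degree of parallelism. As an improvement on the original delay scheduling algorithm, a delay-capacity scheduling algorithm is proposed that allows some tasks to choose non-local nodes as their target computing nodes, in order to improve job response time and increase job parallelism. Comparative experiments show that the improved algorithm clearly outperforms the original delay scheduling algorithm in execution efficiency and parallelism.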
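
For orientation, the baseline delay scheduling idea that this abstract builds on can be sketched as below. This shows the commonly described form of the technique, in which a job that has no data-local task for the requesting node is skipped a bounded number of times before it may run non-locally. It is not the paper's delay-capacity variant, and the data layout, the `max_skips` parameter and all names are assumptions made for illustration.

```python
def pick_task(node, jobs, max_skips):
    """Choose a task for a node that has a free slot.

    jobs: list of dicts in scheduling order, each with
          'pending': list of (task_id, set of nodes holding the task's data)
          'skips':   how many times the job has been passed over
    Returns (task_id, is_local) or None if nothing is launched.
    """
    for job in jobs:
        if not job["pending"]:
            continue
        # Prefer a task whose input data lives on the requesting node.
        for i, (task_id, data_nodes) in enumerate(job["pending"]):
            if node in data_nodes:
                job["pending"].pop(i)
                job["skips"] = 0
                return task_id, True
        # No local task: launch non-locally only after waiting long enough.
        if job["skips"] >= max_skips:
            task_id, _ = job["pending"].pop(0)
            return task_id, False
        job["skips"] += 1
    return None


# One job whose only task has its data on node n2
jobs = [{"pending": [("t1", {"n2"})], "skips": 0}]
first = pick_task("n1", jobs, max_skips=2)   # skipped, skips becomes 1
second = pick_task("n1", jobs, max_skips=2)  # skipped, skips becomes 2
third = pick_task("n1", jobs, max_skips=2)   # launched non-locally
```

The skip threshold trades locality against waiting time: a larger value gives local slots more chances to free up, at the cost of delaying the job.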

13.
Admission Control with Immediate Notification   Total citations: 1, self-citations: 0, citations by others: 1
When admission control is used, an on-line scheduler chooses whether or not to complete each individual job successfully by its deadline. An important consideration is at what point in time the scheduler determines if a job request will be satisfied, and thus at what point the scheduler is able to provide notification to the job owner as to the fate of the request. In the loosest model, often seen in real-time systems, such a decision can be deferred up until the job's deadline passes. In the strictest model, more suitable for customer-based applications, a scheduler would be required to give notification at the instant that a job request arrives. Unfortunately there seems to be little existing research which explicitly studies the effect of the notification model on the performance guarantees of a scheduler. We undertake such a study by reexamining a problem from the literature. Specifically, we study the effect of the notification model on the non-preemptive scheduling of a single resource in order to maximize utilization. At first glance, it appears severely more restrictive to compare a scheduler required to give immediate notification to one which need not give any notification. Yet we are able to present alternate algorithms which provide immediate notification, while matching most of the performance guarantees which are possible by schedulers which provide no such notification. In only one case are we able to give evidence that providing immediate notification may be more difficult.

14.
Object-based parallel file systems have emerged as promising storage solutions for high-performance computing (HPC) systems. Despite the fact that object storage provides a flexible interface, scheduling highly concurrent I/O requests that access a large number of objects remains a challenging problem, especially when stragglers (storage servers that are significantly slower than others) exist in the system. An efficient I/O scheduler needs to avoid possible stragglers to achieve low latency and high throughput. In this paper, we introduce log-assisted straggler-aware I/O scheduling to mitigate the impact of storage server stragglers. The contribution of this study is threefold. First, we introduce a client-side, log-assisted, straggler-aware I/O scheduler architecture to tackle the storage straggler issue in HPC systems. Second, we present three scheduling algorithms that can make efficient decisions for scheduling I/Os while avoiding stragglers based on such an architecture. Third, we evaluate the proposed I/O scheduler using simulations, and the simulation results have confirmed the promise of the newly introduced straggler-aware I/O scheduler.

15.
Resource provisioning and scheduling are crucial for cloud workflow applications. Simulation is one of the most promising evaluation methods for different resource provisioning and scheduling algorithms. However, existing simulators for cloud workflow applications fail to provide support for resource runtime auto-scaling and stochastic task execution time modeling. In this paper, a workflow simulator, ElasticSim, is introduced, which extends the widely used CloudSim simulator by adding support for resource runtime auto-scaling and stochastic task execution time modeling. Most existing workflow scheduling algorithms are static and are based on deterministic task execution times. With the aid of ElasticSim, the practical performance of existing static algorithms, when they are put into practice with stochastic task execution times, is evaluated. Experimental results show that about 2.8% to 20% additional resource rental cost is incurred for different cases and workflow deadlines are violated for most cases because of stochastic task execution times. Therefore, ElasticSim is a promising platform for evaluating the practical performance of workflow resource provisioning and scheduling algorithms, which supports resource runtime auto-scaling and stochastic task execution time modeling.

16.
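A method is proposed for scheduling workflow tasks with time constraints on a computing resource sharing platform. The method uses a decentralized, tree-structured application-layer overlay network topology, which allows resource availability information to be collected efficiently and quickly. It combines a global scheduler with local schedulers: by defining a resource collection procedure, the local scheduler on each node reports its own resource availability to the global scheduler, and the deadline constraints and recovery times of workflow tasks are handled through a time-slot mechanism. Simulation results show that workflow tasks following the divide-and-conquer pattern and the iterative equation-solving pattern can be scheduled and run successfully on the platform, with fairly short response times and low communication load.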

17.
In recent years, scientific workflows have emerged as a fundamental abstraction for structuring and executing scientific experiments in computational environments. Scientific workflows are becoming increasingly complex and more demanding in terms of computational resources, thus requiring the usage of parallel techniques and high performance computing (HPC) environments. Meanwhile, clouds have emerged as a new paradigm where resources are virtualized and provided on demand. By using clouds, scientists have expanded beyond single parallel computers to hundreds or even thousands of virtual machines. Although the initial focus of clouds was to provide high throughput computing, clouds are already being used to provide an HPC environment where elastic resources can be instantiated on demand during the course of a scientific workflow. However, this model also raises many open, yet important, challenges such as scheduling workflow activities. Scheduling parallel scientific workflows in the cloud is a very complex task since we have to take into account many different criteria and to explore the elasticity characteristic for optimizing workflow execution. In this paper, we introduce an adaptive scheduling heuristic for parallel execution of scientific workflows in the cloud that is based on three criteria: total execution time (makespan), reliability and financial cost. Besides scheduling workflow activities based on a 3-objective cost model, this approach also scales resources up and down according to the restrictions imposed by scientists before workflow execution. This tuning is based on provenance data captured and queried at runtime. We conducted a thorough validation of our approach using a real bioinformatics workflow. The experiments were performed in SciCumulus, a cloud workflow engine for managing scientific workflow execution.

18.
Multiprocessor scheduling in a shared multiprogramming environment can be structured in two levels, where a kernel-level job scheduler allots processors to jobs and a user-level thread scheduler maps the ready threads of a job onto the allotted processors. We present two provably-efficient two-level scheduling schemes called G-RAD and S-RAD respectively. Both schemes use the same job scheduler RAD for the processor allotments that ensures fair allocation under all levels of workload. In G-RAD, RAD is combined with a greedy thread scheduler suitable for centralized scheduling; in S-RAD, RAD is combined with a work-stealing thread scheduler more suitable for distributed settings. Both G-RAD and S-RAD are non-clairvoyant. Moreover, they provide effective control over the scheduling overhead and ensure efficient utilization of processors. We also analyze the competitiveness of both G-RAD and S-RAD with respect to an optimal clairvoyant scheduler. In terms of makespan, both schemes can achieve O(1)-competitiveness for any set of jobs with arbitrary release time. In terms of mean response time, both schemes are O(1)-competitive for arbitrary batched jobs. To the best of our knowledge, G-RAD and S-RAD are the first non-clairvoyant scheduling algorithms that guarantee provable efficiency, fairness and minimal overhead.

19.
A new approach for dynamic job scheduling in mesh-connected multiprocessor systems, which supports a multiuser environment, is proposed in this paper. Our approach combines a submesh reservation policy with a priority-based scheduling policy to obtain high performance in terms of high throughput, high utilization, and low turn-around times for jobs. This high performance is achieved at the expense of scheduling jobs in a strictly fair, FCFS fashion; in fact, the algorithm is parameterized to allow trade-offs between performance and (short-term) FCFS fairness. The proposed scheduler can be used with any submesh allocation policy. A fast and efficient implementation of the proposed scheduler has also been presented. The performance of the proposed scheme has been compared with the FCFS policy, the only existing scheduling strategy for meshes, to demonstrate the effectiveness of the proposed approach. Simulation results indicate that our scheduling strategy outperforms the FCFS policy significantly. Specifically, our strategy significantly reduces the average waiting delay of jobs over the FCFS policy. The fast implementation of the proposed scheduler results in low allocation and deallocation time overhead, as well as low space overhead.

20.
DAGMap: efficient and dependable scheduling of DAG workflow job in Grid   Total citations: 1, self-citations: 1, citations by others: 0
DAG has been extensively used in Grid workflow modeling. Since Grid resources tend to be heterogeneous and dynamic, efficient and dependable workflow job scheduling becomes essential. It poses great challenges to achieve minimum job accomplishing time and high resource utilization efficiency, while providing fault tolerance. Based on list scheduling and group scheduling, in this paper, we propose a novel scheduling heuristic called DAGMap. DAGMap consists of two phases, namely Static Mapping and Dependable Execution. Four salient features of DAGMap are: (1) Task grouping is based on dependency relationships and task upward priority; (2) Critical tasks are scheduled first; (3) Min-Min and Max-Min selective scheduling are used for independent tasks; and (4) Checkpoint server with cooperative checkpointing is designed for dependable execution. The experimental results show that DAGMap can achieve better performance than other previous algorithms in terms of speedup, efficiency, and dependability.
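
Feature (3) refers to the classic Min-Min heuristic for independent tasks, which can be sketched as follows. The code shows the textbook heuristic only; how DAGMap chooses between Min-Min and Max-Min is not described in the abstract. The `etc` matrix (expected time to compute for each task on each machine) and its values are hypothetical.

```python
def min_min(etc):
    """Schedule independent tasks with the Min-Min heuristic.

    etc[t][m] is the expected execution time of task t on machine m.
    Returns (schedule, makespan), where schedule maps task -> machine.
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine is free
    unscheduled = set(range(len(etc)))
    schedule = {}
    while unscheduled:
        # Earliest completion time, and the machine achieving it, per task.
        best = {t: min((ready[m] + etc[t][m], m) for m in range(n_machines))
                for t in unscheduled}
        # Min-Min: take the task with the smallest such completion time.
        task = min(unscheduled, key=lambda t: best[t])
        finish, machine = best[task]
        schedule[task] = machine
        ready[machine] = finish
        unscheduled.remove(task)
    return schedule, max(ready)


# Three tasks, two machines
etc = [[4, 6],
       [3, 5],
       [8, 2]]
schedule, makespan = min_min(etc)
```

Max-Min differs only in the selection step: it takes the task whose best completion time is largest, which tends to place long tasks early and can balance load better when a few tasks dominate.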
