Similar Articles
 20 similar articles found (search time: 31 ms)
1.
In recent years, scientific workflows have emerged as a fundamental abstraction for structuring and executing scientific experiments in computational environments. Scientific workflows are becoming increasingly complex and more demanding in terms of computational resources, thus requiring the use of parallel techniques and high performance computing (HPC) environments. Meanwhile, clouds have emerged as a new paradigm where resources are virtualized and provided on demand. By using clouds, scientists have expanded beyond single parallel computers to hundreds or even thousands of virtual machines. Although the initial focus of clouds was to provide high throughput computing, clouds are already being used to provide an HPC environment where elastic resources can be instantiated on demand during the course of a scientific workflow. However, this model also raises many open, yet important, challenges, such as the scheduling of workflow activities. Scheduling parallel scientific workflows in the cloud is a very complex task, since many different criteria must be taken into account and the cloud's elasticity must be exploited to optimize workflow execution. In this paper, we introduce an adaptive scheduling heuristic for parallel execution of scientific workflows in the cloud that is based on three criteria: total execution time (makespan), reliability and financial cost. Besides scheduling workflow activities based on a 3-objective cost model, this approach also scales resources up and down according to the restrictions imposed by scientists before workflow execution. This tuning is based on provenance data captured and queried at runtime. We conducted a thorough validation of our approach using a real bioinformatics workflow. The experiments were performed in SciCumulus, a cloud workflow engine for managing scientific workflow execution.
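To make the 3-criteria idea concrete, here is a minimal sketch of how a scheduler might aggregate makespan, reliability and financial cost into a single score; the weights, normalization bounds and the `schedule_score` function are illustrative assumptions, not the cost model actually used by SciCumulus.

```python
# Hypothetical sketch of a 3-objective cost model (makespan, reliability,
# financial cost); weights and normalization bounds are illustrative only.

def schedule_score(makespan_s, failure_prob, cost_usd,
                   w_time=0.4, w_rel=0.3, w_cost=0.3,
                   max_time=3600.0, max_cost=100.0):
    """Lower is better. Each criterion is normalized to [0, 1]."""
    t = min(makespan_s / max_time, 1.0)   # normalized execution time
    r = failure_prob                      # probability the activity fails
    c = min(cost_usd / max_cost, 1.0)     # normalized monetary cost
    return w_time * t + w_rel * r + w_cost * c

# Pick the candidate VM with the best (lowest) aggregate score.
candidates = [
    {"vm": "small", "makespan_s": 2400, "failure_prob": 0.02, "cost_usd": 10},
    {"vm": "large", "makespan_s":  900, "failure_prob": 0.05, "cost_usd": 45},
]
best = min(candidates, key=lambda c: schedule_score(
    c["makespan_s"], c["failure_prob"], c["cost_usd"]))
print(best["vm"])
```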

2.
Cloud computing, an important source of computing power for the scientific community, requires enhanced tools for efficient use of resources. Current solutions for workflow execution lack frameworks that deeply analyze applications and consider realistic execution times as well as computation costs. In this study, we propose cloud user–provider affiliation (CUPA) to guide workflow owners in identifying the tools required to run their applications. Additionally, we develop PSO-DS, a specialized scheduling algorithm based on particle swarm optimization. CUPA encompasses the interaction of cloud resources, the workflow management system and the scheduling algorithm. Its featured scheduler, PSO-DS, converges to a strategic distribution of tasks among resources that efficiently optimizes makespan and monetary cost. We compared PSO-DS performance against four well-known scientific workflow schedulers. In a test bed based on VMware vSphere, the schedulers mapped five up-to-date benchmarks representing different scientific areas. PSO-DS proved its efficiency by reducing the makespan and monetary cost of the tested workflows by 75% and 78%, respectively, compared with the other algorithms. CUPA, with its featured scheduler PSO-DS, opens the path to developing a full system in which scientific cloud users can run their computationally expensive experiments.
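As an illustration of the particle-swarm technique underlying schedulers such as PSO-DS, the sketch below applies a generic continuous PSO to a task-to-VM mapping problem with independent tasks; the runtime matrix, swarm parameters and fitness function are invented for the example and do not reproduce PSO-DS itself.

```python
# Generic PSO sketch for mapping tasks to VMs; an illustration of the
# technique, not the actual PSO-DS algorithm.
import random

N_TASKS, N_VMS = 8, 3
RUNTIME = [[random.uniform(1, 10) for _ in range(N_VMS)] for _ in range(N_TASKS)]

def fitness(position):
    # Decode: task i runs on VM round(position[i]); fitness is the makespan
    # of independent tasks (no dependencies, for simplicity).
    load = [0.0] * N_VMS
    for i, x in enumerate(position):
        vm = min(N_VMS - 1, max(0, int(round(x))))
        load[vm] += RUNTIME[i][vm]
    return max(load)

swarm = [[random.uniform(0, N_VMS - 1) for _ in range(N_TASKS)] for _ in range(20)]
vel = [[0.0] * N_TASKS for _ in range(20)]
pbest = [p[:] for p in swarm]
gbest = min(swarm, key=fitness)[:]

for _ in range(100):
    for k, p in enumerate(swarm):
        for d in range(N_TASKS):
            r1, r2 = random.random(), random.random()
            vel[k][d] = (0.7 * vel[k][d]                      # inertia
                         + 1.5 * r1 * (pbest[k][d] - p[d])    # cognitive pull
                         + 1.5 * r2 * (gbest[d] - p[d]))      # social pull
            p[d] += vel[k][d]
        if fitness(p) < fitness(pbest[k]):
            pbest[k] = p[:]
    gbest = min(pbest, key=fitness)[:]
print("best makespan:", fitness(gbest))
```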

3.
The workflow paradigm has become the standard for representing processes and their execution flows. With the evolution of e-Science, workflows are becoming larger and more computationally demanding. These e-Science needs match what computational Grids have to offer. Grids are shared distributed platforms that will eventually receive multiple concurrent requests to execute workflows, which creates demand for schedulers that deal with multiple workflows on the same set of resources; the development of multiple-workflow scheduling algorithms is therefore necessary. In this paper we describe four initial strategies for scheduling multiple workflows on Grids and evaluate them in terms of schedule length and fairness. We present results for the initial schedule and for the makespan after execution with external load. From the results we conclude that interleaving the workflows on the Grid leads to a good average makespan and provides fairness when multiple workflows share the same set of resources.

4.
Security is increasingly critical for scientific workflows, which are big data applications that typically take a considerable amount of time to execute on large-scale distributed infrastructures. Cloud computing platforms are such infrastructures, enabling dynamic resource scaling on demand. Nevertheless, under the pay-per-use, hourly-based pricing model, users should pay attention to the cost incurred by renting virtual machines (VMs) from cloud data centers. Meanwhile, workflow tasks are generally heterogeneous and require different instance series (e.g., compute optimized, memory optimized, storage optimized). In this paper, we propose a security- and cost-aware scheduling (SCAS) algorithm for the heterogeneous tasks of scientific workflows in clouds. Our algorithm is based on a meta-heuristic optimization technique, particle swarm optimization (PSO), whose coding strategy is devised to minimize the total workflow execution cost while meeting the deadline and risk-rate constraints. Extensive experiments using three real-world scientific workflow applications and the CloudSim simulation framework demonstrate the effectiveness and practicality of our algorithm.

5.
Bag-of-Tasks (BoT) workflows are widespread in many big data analysis fields. However, there are very few cloud resource provisioning and scheduling algorithms tailored for BoT workflows, and existing algorithms fail to consider the stochastic task execution times of BoT workflows, which leads to deadline violations and increased resource renting costs. In this paper, we propose a dynamic cloud resource provisioning and scheduling algorithm which aims to meet the workflow deadline by using the sum of the expectation and standard deviation of a task's execution time to estimate its real execution time. A bag-based delay scheduling strategy and a single-type-based virtual machine interval renting method are presented to decrease the resource renting cost. The proposed algorithm is evaluated using ElasticSim, a cloud simulator extended from CloudSim. The results show that, compared with existing algorithms, the dynamic algorithm decreases the resource renting cost while guaranteeing the workflow deadline.
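The runtime estimate described above (expectation plus standard deviation) is easy to state as code; this minimal sketch assumes a small sample of observed runtimes and an illustrative sub-deadline.

```python
# Minimal sketch of the conservative runtime estimate described above:
# estimate = mean + standard deviation of observed task execution times.
# Sample data and threshold are illustrative.
from statistics import mean, stdev

samples = [42.0, 51.5, 47.3, 60.2, 44.8]   # past runtimes of a BoT task (s)
estimate = mean(samples) + stdev(samples)  # hedges against variability

# Use the estimate when checking a task against its sub-deadline.
sub_deadline = 75.0
print(f"estimated runtime: {estimate:.1f}s, "
      f"fits deadline: {estimate <= sub_deadline}")
```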

6.
Optimizing cloud provisioning for scientific workflow applications is a challenging problem, since workflows generally contain dependencies between tasks and require specific deadlines. Cloud providers usually offer many options to consumers, including the number of virtual machines, the type of each virtual machine and the purchasing method for each machine. Cloud provisioning cost optimization is currently an active research topic, but most of the literature is concerned with either task scheduling or cloud option selection for scientific workflow applications; research that covers both cloud option selection and workflow task scheduling together is very limited. In this paper, we focus on optimizing the cost of purchasing infrastructure-as-a-service cloud capabilities to achieve scientific workflow execution within specific deadlines. The proposed system considers the number of purchased instances, instance types, purchasing options, and task scheduling as constraints in an optimization process. Particle swarm optimization augmented with a variable neighborhood search technique is used to find the optimal solution. Our approach finds the configuration of purchasing options with the optimum budget for a specified workflow application based on the required performance. The solutions from the proposed system show promising performance in terms of total cost and fitness convergence when compared with other state-of-the-art algorithms.

7.
The emergence of Cloud Computing as a model of service provisioning in distributed systems has led researchers to explore its pros and cons for executing large-scale scientific applications, i.e., workflows. One of the most challenging problems in clouds is to execute workflows while minimizing both the execution time and the cost incurred by using a set of heterogeneous cloud resources simultaneously. In this paper, we present a Budget- and Deadline-Constrained Heuristic based upon the Heterogeneous Earliest Finish Time (HEFT) algorithm to schedule workflow tasks over the available cloud resources. The proposed heuristic presents a beneficial trade-off between execution time and execution cost under the given constraints. It is evaluated for different synthetic workflow applications through simulation and compared with a state-of-the-art algorithm, BHEFT. The simulation results show that our scheduling heuristic can significantly decrease the execution cost while producing a makespan as good as that of the best known scheduling heuristic under the same deadline and budget constraints.
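A hedged sketch of the budget/deadline trade-off idea: among the VMs that can finish a task before its sub-deadline, choose the cheapest. The data model, the `pick_vm` helper and the fallback rule are assumptions for illustration, not the paper's exact heuristic.

```python
# Illustrative time/cost trade-off: pick the cheapest VM that still meets
# the task's sub-deadline; fall back to the fastest VM if none does.

def pick_vm(ready_time, work_mflop, sub_deadline, vms):
    feasible = []
    for vm in vms:
        finish = ready_time + work_mflop / vm["mflops"]
        if finish <= sub_deadline:
            cost = vm["price_per_s"] * (finish - ready_time)
            feasible.append((cost, vm))
    if not feasible:
        # No VM meets the sub-deadline: fall back to the fastest one.
        return min(vms, key=lambda v: work_mflop / v["mflops"])
    return min(feasible, key=lambda t: t[0])[1]  # cheapest feasible VM

vms = [{"name": "small", "mflops": 100, "price_per_s": 0.01},
       {"name": "large", "mflops": 400, "price_per_s": 0.05}]
print(pick_vm(ready_time=0.0, work_mflop=5000, sub_deadline=30.0, vms=vms)["name"])
```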

8.
While providing users with massive virtual resources, cloud service providers also face a practical problem: how to schedule these resources so that workflows complete at minimal cost (makespan, execution cost, resource utilization, etc.). Targeting the workflow scheduling problem in IaaS environments, with makespan and execution cost as objectives, this paper proposes a decomposition-based multi-objective workflow scheduling algorithm. The algorithm combines a list-based heuristic with the selection process of multi-objective evolutionary algorithms, and adopts a decomposition method that breaks the multi-objective optimization problem into a set of single-objective subproblems, which are then solved simultaneously, making the scheduling process simpler and more effective. Experiments with real-world workflows released by the Pegasus project show that, compared with the MOHEFT and NSGA-II* algorithms, the proposed algorithm obtains a better Pareto set with lower time complexity.
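The decomposition idea can be sketched as follows: a set of weight vectors turns the bi-objective (makespan, cost) problem into scalar subproblems, each solved by simple minimization. The weighted-sum aggregation and the normalization bounds are illustrative choices in the spirit of MOEA/D-style decomposition, not necessarily the paper's exact formulation.

```python
# Decomposing a bi-objective problem (makespan, cost) into scalar
# subproblems via uniformly spread weight vectors; values are illustrative.

def decompose(n_subproblems):
    """Return weight vectors (w, 1-w) spread uniformly over [0, 1]."""
    return [(i / (n_subproblems - 1), 1 - i / (n_subproblems - 1))
            for i in range(n_subproblems)]

def scalar_fitness(makespan, cost, w, t_max=1000.0, c_max=50.0):
    """Weighted-sum aggregation of the two normalized objectives."""
    return w[0] * makespan / t_max + w[1] * cost / c_max

# Each candidate schedule is evaluated against every subproblem; the best
# candidate per weight vector approximates one point of the Pareto front.
schedules = [(900.0, 5.0), (400.0, 20.0), (150.0, 45.0)]  # (makespan, cost)
for w in decompose(5):
    best = min(schedules, key=lambda s: scalar_fitness(s[0], s[1], w))
    print(f"w={w[0]:.2f}: best schedule {best}")
```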

9.
Nowadays, one of the main problems in cloud workflow scheduling is how to reduce the execution cost of a workflow while satisfying its deadline constraint. Three-step list scheduling algorithms can solve this problem effectively, but they can only produce static sub-deadlines in the deadline distribution phase. To make it easier for users to deploy workflow tasks, cloud providers offer three instance types, among which spot instances have a very large price advantage. To address these issues, a cost optimization algorithm for workflow scheduling with dynamic deadline distribution (S-DTDA) is proposed. The algorithm uses particle swarm optimization to distribute the deadline dynamically, remedying the shortcoming of the three-step list scheduling algorithm. In the virtual machine selection phase, the algorithm adds spot instances to the candidate resources, greatly reducing the execution cost. Experimental results show that, compared with other classical algorithms, the proposed algorithm has clear advantages in success rate and execution cost. In summary, the S-DTDA algorithm can effectively solve the cost optimization problem of deadline-constrained workflow scheduling.

10.
To address the conflict between completion time and execution cost for deadline-constrained cloud workflows, a hybrid adaptive particle swarm workflow scheduling optimization algorithm (HAPSO) is proposed. First, a directed acyclic graph (DAG) cloud workflow scheduling model is built on the deadline. Then, by combining the norm ideal point with adaptive weights, the DAG scheduling model is transformed into a multi-objective optimization problem that balances DAG completion time and execution cost. Finally, adaptive inertia weight, adaptive learning factors, the probabilistic switching mechanism of the flower pollination algorithm, the firefly algorithm (FA), and a particle out-of-bounds handling method are introduced on top of particle swarm optimization (PSO), balancing the swarm's global and local search capabilities so as to solve the optimization problem over DAG completion time and execution cost. The experiments compare the optimization results of PSO, inertia-weight PSO (WPSO), ant colony optimization (ACO) and HAPSO. The results show that HAPSO reduces the multi-objective function value balancing completion time and execution cost by 40.9%–81.1% for workflows with 30–300 tasks, and that it effectively balances completion time and execution cost under workflow deadline constraints. In addition, HAPSO also performs well on the single objectives of reducing completion time or execution cost, which verifies its general applicability.
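As an illustration of the adaptive-parameter idea, the sketch below shows a linearly decaying inertia weight and time-varying learning factors, a common scheme in adaptive PSO variants; the exact update rules used by HAPSO may differ.

```python
# Illustrative adaptive PSO parameters: inertia decays from w_max to w_min,
# shifting the search from global exploration to local exploitation.

def inertia_weight(iteration, max_iterations, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight."""
    return w_max - (w_max - w_min) * iteration / max_iterations

def learning_factors(iteration, max_iterations, c_start=2.5, c_end=0.5):
    """Cognitive factor decays while the social factor grows over time."""
    frac = iteration / max_iterations
    c1 = c_start + (c_end - c_start) * frac   # personal-best attraction
    c2 = c_end + (c_start - c_end) * frac     # global-best attraction
    return c1, c2

for it in (0, 50, 100):
    w = inertia_weight(it, 100)
    c1, c2 = learning_factors(it, 100)
    print(f"iter {it:3d}: w={w:.2f}, c1={c1:.2f}, c2={c2:.2f}")
```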

11.
Air Quality Forecasting (AQF) is a new discipline that attempts to reliably predict atmospheric pollution. An AQF application has complex workflows, and in order to produce timely and reliable forecast results, each execution requires access to diverse and distributed computational and storage resources. Deploying AQF on Grids is one option to satisfy such needs, but it requires the related Grid middleware to support automated workflow scheduling and execution on Grid resources. In this paper, we analyze the challenges in deploying an AQF application in a campus Grid environment and present our current efforts to develop a general solution for Grid-enabling scientific workflow applications in the GRACCE project. In GRACCE, an application's workflow is described using GAMDL, a powerful dataflow language for describing application logic. The GRACCE metascheduling architecture provides the functionalities required for co-allocating Grid resources for workflow tasks, scheduling the workflows and monitoring their execution. By providing an integrated framework for modeling and metascheduling scientific workflow applications on Grid resources, we make it easy to build a customized environment with end-to-end support for application Grid deployment, from the management of an application and its dataset to the automatic execution and analysis of its results. The work has been performed as part of the University of Houston's Sun Microsystems Center of Excellence in Geosciences [38].

12.
An evolutionary genetic algorithm for multi-objective optimal cloud workflow scheduling
To simultaneously optimize the makespan and execution cost of scientific workflow scheduling in cloud environments, a multi-objective optimization evolutionary genetic scheduling algorithm, MOEGA, is proposed. Built on evolutionary genetics, the algorithm defines an encoding mechanism for the mapping of tasks to virtual machines and the deployment of virtual machines on hosts, and designs a fitness function that satisfies the multi-objective optimization. Meanwhile, to preserve population diversity, crossover and mutation operations are introduced into the scheduling scheme, and a heuristic method is used to initialize the population. Simulation experiments with four real-world scientific workflows compare its performance with algorithms of the same type. The results show that MOEGA not only satisfies workflow deadline constraints, but also outperforms the other algorithms in the combined performance of reducing task makespan and execution cost.

13.
To optimize the monetary cost and execution efficiency of cloud workflow scheduling, a workflow scheduling algorithm based on directed acyclic graph (DAG) partitioning, PBWS, is proposed. Aiming at the simultaneous optimization of scheduling efficiency and cost, the algorithm divides the scheduling process into three phases: workflow DAG structure partitioning, partition structure adjustment, and resource allocation. The DAG structure partitioning phase obtains an initial task partition graph while preserving the execution-order dependencies among tasks; the partition adjustment phase reassigns tasks between partitions with the goal of reducing the makespan; and the resource allocation phase selects the most cost-efficient mapping between tasks and resources, ensuring that the total idle time of resources is minimal. Simulation experiments on five scientific workflow DAG models show that PBWS greatly reduces workflow execution cost at the expense of only a small increase in makespan, achieving simultaneous optimization of scheduling efficiency and scheduling cost, and that its overall performance is superior to algorithms of the same type.

14.
With the rapid development of cloud computing, deploying workflows on cloud platforms has become a common choice. Compared with traditional local workflows, cloud workflows must consider not only requirements such as computation time but also the monetary cost they incur. To improve resource utilization, cloud providers offer preemptible VM instances, a very cheap but unstable kind of resource. Targeting the scheduling and execution of workflows in the cloud, this paper proposes a preemptible VM instance provisioning and scheduling method that meets workflow deadlines. The method uses a Markov model and dynamic programming to predict the prices of preemptible instances and derive the lowest-cost bidding strategy, and then configures the instances used by the workflow under the estimated bidding strategy subject to the workflow's deadline requirement. Experimental results show that, compared with using only on-demand VM instances, the method saves up to 89.9% of the computation cost while meeting workflow deadlines.
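The price-prediction step can be sketched by discretizing historical spot prices into states and estimating a Markov transition matrix from observed transitions; the price history and bid below are invented for illustration, and the paper's actual model and dynamic program are more elaborate.

```python
# Estimate a Markov transition matrix over discretized spot-price states
# from a (made-up) price history, then query it for a bidding decision.
from collections import Counter

history = [0.12, 0.12, 0.15, 0.18, 0.15, 0.12, 0.15, 0.15, 0.18, 0.12]
states = sorted(set(history))
index = {p: i for i, p in enumerate(states)}

# Count observed transitions between consecutive price states.
counts = Counter(zip(history, history[1:]))
matrix = [[0.0] * len(states) for _ in states]
for (a, b), n in counts.items():
    matrix[index[a]][index[b]] = n
for row in matrix:
    total = sum(row)
    if total:
        row[:] = [x / total for x in row]

# Probability that the price stays at or below a bid for the next step.
bid, current = 0.15, 0.12
p_ok = sum(matrix[index[current]][j] for j, s in enumerate(states) if s <= bid)
print(f"P(next price <= {bid} | current {current}) = {p_ok:.2f}")
```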

15.
Scientific communities have recently been producing a growing number of computation-intensive applications, which calls for the interoperation of distributed infrastructures including Clouds, Grids and private clusters. The European SHIWA and ER-flow projects have enabled the combination of heterogeneous scientific workflows and their execution in a large-scale system consisting of multiple Distributed Computing Infrastructures. One of the resource management challenges of these projects is parameter study job scheduling: a parameter study job of a workflow generally has a large number of input files to be consumed by independent job instances. In this paper we propose a meta-brokering framework for science gateways to support the execution of such workflows. In order to cope with the high uncertainty and unpredictable load of the utilized distributed infrastructures, we introduce so-called resource priority services. These tools are capable of determining and dynamically updating the priorities of the available infrastructures to be selected for job instances. Our evaluations show that this approach yields an efficient distribution of job instances among the available computing resources, resulting in shorter makespans for parameter study workflows.

16.
The workflow scheduling problem has drawn a lot of attention in the research community. This paper presents a workflow scheduling algorithm, called granularity score scheduling (GSS), which is based on the granularity of the tasks in a given workflow. The main objectives of GSS are to minimize the makespan and maximize the average virtual machine utilization. The algorithm consists of three phases, namely B-level calculation, score adjustment, and task ranking and scheduling. We simulate the proposed algorithm using various benchmark scientific workflow applications, i.e., CyberShake, Epigenomics, Inspiral and Montage. The simulation results are compared with two well-known existing workflow scheduling algorithms, namely heterogeneous earliest finish time and performance effective task scheduling, which are also applied in a cloud computing environment. Based on the simulation results, the proposed algorithm demonstrates remarkable performance in terms of makespan and average virtual machine utilization.
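The B-level phase named above is a classic DAG metric: the length of the longest path from a task to an exit task, counting computation and communication costs. The sketch below computes it on a toy graph; the graph and costs are illustrative.

```python
# B-level (bottom level) of each task in a DAG:
# b_level(t) = comp(t) + max over successors s of (comm(t, s) + b_level(s)).
from functools import lru_cache

comp = {"A": 4, "B": 3, "C": 5, "D": 2}           # task computation cost
succ = {"A": [("B", 1), ("C", 2)],                # edges with comm cost
        "B": [("D", 1)], "C": [("D", 3)], "D": []}

@lru_cache(maxsize=None)
def b_level(task):
    if not succ[task]:
        return comp[task]
    return comp[task] + max(comm + b_level(s) for s, comm in succ[task])

# Schedule tasks in decreasing B-level order (higher = more critical).
order = sorted(succ, key=b_level, reverse=True)
print({t: b_level(t) for t in order})  # here: A=16, C=10, B=6, D=2
```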

17.
Due to the highly dynamic nature of Grid environments, dependable workflow scheduling is critical. Various scheduling algorithms have been proposed, but they seldom consider resource reliability. Current Grid systems mainly exploit fault-tolerance mechanisms to guarantee dependable workflow execution, which, however, wastes system resources. This paper proposes a dependable Grid workflow scheduling system (called DGWS). It introduces a Markov chain-based resource availability prediction model and, based on this model, presents a reliability-cost-driven workflow scheduling algorithm. The performance evaluation results, including simulations on both parametrically generated random DAGs and two real scientific workflow applications, demonstrate that, compared with existing workflow scheduling algorithms, DGWS improves the success ratio of tasks and reduces the makespan of workflows, thus improving the dependability of workflow execution in dynamic Grid environments.
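A minimal sketch of a Markov-chain availability model in the spirit of the one described: a resource alternates between UP and DOWN states with transition probabilities estimated from monitoring data. The transition matrix here is invented for illustration, not DGWS's actual model.

```python
# Two-state availability Markov chain; P[state][next_state], 0 = UP, 1 = DOWN.
P = [[0.95, 0.05],   # UP   -> UP / DOWN
     [0.30, 0.70]]   # DOWN -> UP / DOWN

def prob_up_after(steps, start_up=True):
    """Probability the resource is UP after `steps` transitions."""
    dist = [1.0, 0.0] if start_up else [0.0, 1.0]
    for _ in range(steps):
        dist = [dist[0] * P[0][0] + dist[1] * P[1][0],
                dist[0] * P[0][1] + dist[1] * P[1][1]]
    return dist[0]

# A scheduler can weigh a task's expected runtime (in monitoring intervals)
# against the resource's predicted availability over that horizon.
for h in (1, 5, 20):
    print(f"P(up after {h} intervals) = {prob_up_after(h):.3f}")
```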

18.
Executing time-critical applications within cloud environments while satisfying execution deadlines and response time requirements is challenging due to the difficulty of securing guaranteed performance from the underlying virtual infrastructure. Cost-effective solutions for hosting such applications in the Cloud require careful selection of cloud resources and efficient scheduling of individual tasks. Existing solutions for provisioning infrastructures for time-constrained applications are typically based on a single global deadline; many time-critical applications, however, have multiple internal time constraints when responding to new input. In this paper we propose a cloud infrastructure planning algorithm that accounts for multiple overlapping internal deadlines on sets of tasks within an application workflow. To compare with existing work, we adapted the IC-PCP algorithm and compared it with our own algorithm using a large set of workflows generated at different scales with different execution profiles and deadlines. Our results show that the proposed algorithm can satisfy all overlapping deadline constraints where possible given the available resources, and does so with consistently lower host cost than IC-PCP.

19.
Workflow scheduling is a key issue and remains a challenging problem in cloud computing. Faced with the large number of virtual machine (VM) types offered by cloud providers, cloud users need to choose the most appropriate VM type for each task. Multiple task scheduling sequences exist in a workflow application, and different task scheduling sequences have a significant impact on scheduling performance. It is not easy to determine the most appropriate set of VM types for the tasks and the best task scheduling sequence. Besides, the idle time slots on VM instances should be fully used to increase resource utilization and reduce the execution cost of a workflow. This paper considers these three aspects simultaneously and proposes a cloud workflow scheduling approach which combines particle swarm optimization (PSO) and idle time slot-aware rules to minimize the execution cost of a workflow application under a deadline constraint. A new particle encoding is devised to represent the VM type required by each task and the scheduling sequence of tasks. An idle time slot-aware decoding procedure is proposed to decode a particle into a scheduling solution. To handle tasks' invalid priorities caused by the randomness of PSO, a repair method is used to repair those priorities and produce valid task scheduling sequences. The proposed approach is compared with state-of-the-art cloud workflow scheduling algorithms. Experiments show that the proposed approach outperforms the comparative algorithms in terms of both execution cost and success rate in meeting the deadline.
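The particle encoding and priority repair described above might look like the following sketch: one half of the particle selects a VM type per task, the other half holds task priorities, and the repair step produces a precedence-valid order by always popping the highest-priority ready task. The names, data model and repair strategy are assumptions for illustration, not necessarily the paper's exact method.

```python
# Decode a particle into (VM type per task, valid task order), repairing
# priorities into a topological order via a ready-task priority queue.
import heapq

VM_TYPES = ["small", "medium", "large"]
preds = {"t1": [], "t2": ["t1"], "t3": ["t1"], "t4": ["t2", "t3"]}
tasks = list(preds)

def decode(particle):
    n = len(tasks)
    vm_part, prio_part = particle[:n], particle[n:]
    vm_of = {t: VM_TYPES[min(len(VM_TYPES) - 1, max(0, int(x)))]
             for t, x in zip(tasks, vm_part)}
    # Repair priorities: pop the highest-priority *ready* task each step,
    # so the resulting sequence is always a valid topological order.
    indeg = {t: len(p) for t, p in preds.items()}
    ready = [(-prio_part[i], t) for i, t in enumerate(tasks) if indeg[t] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, t = heapq.heappop(ready)
        order.append(t)
        for s, p in preds.items():
            if t in p:
                indeg[s] -= 1
                if indeg[s] == 0:
                    heapq.heappush(ready, (-prio_part[tasks.index(s)], s))
    return vm_of, order

vm_of, order = decode([0.4, 2.3, 1.1, 2.9,   # VM-type part
                       0.9, 0.2, 0.7, 0.5])  # priority part
print(vm_of, order)   # order is precedence-valid, e.g. ['t1','t3','t2','t4']
```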

20.
A growing number of data- and compute-intensive experiments have been modeled as scientific workflows in the last decade, and clouds have emerged as a prominent environment for executing this type of workflow. In this scenario, the investigation of workflow scheduling strategies aiming at reducing execution times has become a top priority and a very popular research field. However, few works consider the problem of data file assignment when solving the task scheduling problem. Usually, a workflow is represented by a graph where nodes represent tasks, and the scheduling problem consists in allocating tasks to machines to be executed at a predefined time, aiming at reducing the makespan of the whole workflow. In this article, we show that the scheduling of scientific workflows can be improved when the task scheduling and data file assignment problems are treated together. Thus, we propose a new workflow representation, where nodes of the workflow graph represent either tasks or data files, and define the Task Scheduling and Data Assignment Problem (TaSDAP) based on this new model. We formulate the problem as an integer programming problem and introduce a hybrid evolutionary algorithm for solving it, named HEA-TaSDAP. To evaluate our approach we conducted both theoretical and practical experiments: first, we compared HEA-TaSDAP with the solutions produced by the mathematical formulation and by other works from the related literature; then, we considered real executions in the Amazon EC2 cloud using a real scientific workflow use case (SciPhy, for phylogenetic analyses). In all experiments, HEA-TaSDAP outperformed the classical approaches from the related literature, such as Min–Min and HEFT.
