首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
Large scale distributed systems typically comprise hundreds to millions of entities (applications, users, companies, universities) that have only a partial view of resources (computers, communication links). How to fairly and efficiently share such resources between entities in a distributed way has thus become a critical question.  相似文献   

2.
The ongoing increase of energy consumption by IT infrastructures forces data center managers to find innovative ways to improve energy efficiency. The latter is also a focal point for different branches of computer science due to its financial, ecological, political, and technical consequences. One of the answers is given by scheduling combined with dynamic voltage scaling technique to optimize the energy consumption. The way of reasoning is based on the link between current semiconductor technologies and energy state management of processors, where sacrificing the performance can save energy.This paper is devoted to investigate and solve the multi-objective precedence constrained application scheduling problem on a distributed computing system, and it has two main aims: the creation of general algorithms to solve the problem and the examination of the problem by means of the thorough analysis of the results returned by the algorithms.The first aim was achieved in two steps: adaptation of state-of-the-art multi-objective evolutionary algorithms by designing new operators and their validation in terms of performance and energy. The second aim was accomplished by performing an extensive number of algorithms executions on a large and diverse benchmark and the further analysis of performance among the proposed algorithms. Finally, the study proves the validity of the proposed method, points out the best-compared multi-objective algorithm schema, and the most important factors for the algorithms performance.  相似文献   

3.
Distributed computing infrastructures are commonly used through scientific gateways, but operating these gateways requires important human intervention to handle operational incidents. This paper presents a self-healing process that quantifies incident degrees of workflow activities from metrics measuring long-tail effect, application efficiency, data transfer issues, and site-specific problems. These metrics are simple enough to be computed online and they make little assumptions on the application or resource characteristics. From their degree, incidents are classified in levels and associated to sets of healing actions that are selected based on association rules modeling correlations between incident levels. We specifically study the long-tail effect issue, and propose a new algorithm to control task replication. The healing process is parametrized on real application traces acquired in production on the European Grid Infrastructure. Experimental results obtained in the Virtual Imaging Platform show that the proposed method speeds up execution up to a factor of 4, consumes up to 26% less resource time than a control execution and properly detects unrecoverable errors.  相似文献   

4.
In distributed query processing systems, load balancing plays an important role in maximizing system throughput. When queries can leverage cached intermediate results, improving the cache hit ratio becomes as important as load balancing in query scheduling, especially when dealing with computationally expensive queries. The scheduling policies must be designed to take into consideration the dynamic contents of the distributed caching infrastructure. In this paper, we propose and discuss several distributed query scheduling policies that directly consider the available cache contents by employing distributed multidimensional indexing structures and an exponential moving average approach to predicting cache contents. These approaches are shown to produce better query plans and faster query response times than traditional scheduling policies that do not predict dynamic contents in distributed caches. We experimentally demonstrate the utility of the scheduling policies using MQO, which is a distributed, Grid-enabled, multiple query processing middleware system we developed to optimize query processing for data analysis and visualization applications.  相似文献   

5.
Scientific workflow orchestration interoperating HTC and HPC resources   总被引:1,自引:0,他引:1  
In this work we describe our developments towards the provision of a unified access method to different types of computing infrastructures at the interoperation level. For that, we have developed a middleware suite which bridges not interoperable middleware stacks used for building distributed computing infrastructures, UNICORE and gLite. Our solution allows to transparently access and operate on HPC and HTC resources from a single interface. Using Kepler as workflow manager, we provide users with the needed integration of codes to create scientific workflows accessing both types of infrastructures.  相似文献   

6.
Deadline-sensitive workflows require careful coordination of user constraints with resource availability. Current distributed resource access models provide varying degrees of resource control: from limited or none in grid batch systems to explicit in cloud systems. Additionally applications experience variability due to competing user loads, performance variations, failures, etc. These variations impact the quality of service (QoS) that goes unaccounted for in planning strategies. In this paper we propose Workflow ORchestrator for Distributed Systems (WORDS) architecture based on a least common denominator resource model that abstracts the differences and captures the QoS properties provided by grid and cloud systems. We investigate algorithms for effective orchestration (i.e., resource procurement and task mapping) for deadline-sensitive workflows atop the resource abstraction provided in WORDS. Our evaluation compares orchestration methodologies over TeraGrid and Amazon EC2 systems. Experimental results show that WORDS enables effective orchestration possible at reasonable costs on batch queue grid and cloud systems with or without explicit resource control.  相似文献   

7.
GridMD is a C++ class library intended for constructing simulation applications and running them in distributed environments. The library abstracts away from details of distributed environments, so that almost no knowledge of distributed computing is required from a physicist working with the library. She or he just uses GridMD function calls inside the application C++ code to perform parameter sweeps or other tasks that can be distributed at run-time. In this paper we briefly review the GridMD architecture. We also describe the job manager component which submits jobs to a remote system. The C++ source code of our PBS job manager may be used as a standalone tool and it is freely available as well as the full library source code. As illustrative examples we use simple expression evaluation codes and the real application of Coulomb cluster explosion simulation by Molecular Dynamics.  相似文献   

8.
Over the last years, comparative genomics analyses have become more compute-intensive due to the explosive number of available genome sequences. Comparative genomics analysis is an important a prioristep for experiments in various bioinformatics domains. This analysis can be used to enhance the performance and quality of experiments in areas such as evolution and phylogeny. A common phylogenetic analysis makes extensive use of Multiple Sequence Alignment (MSA) in the construction of phylogenetic trees, which are used to infer evolutionary relationships between homologous genes. Each phylogenetic analysis aims at exploring several different MSA methods to verify which execution produces trees with the best quality. This phylogenetic exploration may run during weeks, even when executed in High Performance Computing (HPC) environments. Although there are many approaches that model and parallelize phylogenetic analysis as scientific workflows, exploring all MSA methods becomes a complex and expensive task to be performed. If scientists determine a priorithe most adequate MSA method to use in the phylogenetic analysis, it would save time, and, in some cases, financial resources. Comparative genomics analyses play an important role in optimizing phylogenetic analysis workflows. In this paper, we extend the SciHmm scientific workflow, aimed at determining the most suitable MSA method, to use it in a phylogenetic analysis. SciHmm uses SciCumulus, a cloud workflow execution engine, for parallel execution. Experimental results show that using SciHmm considerably reduces the total execution time of the phylogenetic analysis (up to 80%). Experiments also show that trees built with the MSA program elected by using SciHmm presented more quality than the remaining, as expected. In addition, the parallel execution of SciHmm shows that this kind of bioinformatics workflow has an excellent cost/benefit when executed in cloud environments.  相似文献   

9.
Scientific workflows have become a valuable tool for large-scale data processing and analysis. This has led to the creation of specialized online repositories to facilitate workflow sharing and reuse. Over time, these repositories have grown to sizes that call for advanced methods to support workflow discovery, in particular for similarity search. Effective similarity search requires both high quality algorithms for the comparison of scientific workflows and efficient strategies for indexing, searching, and ranking of search results. Yet, the graph structure of scientific workflows poses severe challenges to each of these steps. Here, we present a complete system for effective and efficient similarity search in scientific workflow repositories, based on the Layer Decomposition approach to scientific workflow comparison. Layer Decomposition specifically accounts for the directed dataflow underlying scientific workflows and, compared to other state-of-the-art methods, delivers best results for similarity search at comparably low runtimes. Stacking Layer Decomposition with even faster, structure-agnostic approaches allows us to use proven, off-the-shelf tools for workflow indexing to further reduce runtimes and scale similarity search to sizes of current repositories.  相似文献   

10.
In distributed computing such as grid computing, online users submit their tasks anytime and anywhere to dynamic resources. Task arrival and execution processes are stochastic. How to adapt to the consequent uncertainties, as well as scheduling overhead and response time, are the main concern in dynamic scheduling. Based on the decision theory, scheduling is formulated as a Markov decision process (MDP). To address this problem, an approach from machine learning is used to learn task arrival and execution patterns online. The proposed algorithm can automatically acquire such knowledge without any aforehand modeling, and proactively allocate tasks on account of the forthcoming tasks and their execution dynamics. Under comparison with four classic algorithms such as Min–Min, Min–Max, Suffrage, and ECT, the proposed algorithm has much less scheduling overhead. The experiments over both synthetic and practical environments reveal that the proposed algorithm outperforms other algorithms in terms of the average response time. The smaller variance of average response time further validates the robustness of our algorithm.  相似文献   

11.
Petar   《Performance Evaluation》2007,64(9-12):1053-1061
The maximum weight matching algorithm is a high-performance scheduling algorithm for cross-bar switches. It is known that it performs optimally under heavy loads. However, its centralized nature and high computational complexity limit the algorithm’s applicability. This paper presents a randomized algorithm for distributed switch scheduling that is capable of delivering high throughput.  相似文献   

12.
The emergence of distributed artificial intelligent (DAI) introduced a new approach to solve scheduling problems by a set of scheduling systems that interact with each other in the problem-solving process. In this paper, we describe a communication infrastructure to handle connection and communication between distributed Internet scheduling systems for distributed applications. First, we present an agent model of distributed scheduling systems where agents can communicate and coordinate activities with each other via an agent communication language. Then, we define the syntax and semantics for the agent communication languages, and negotiation mechanism. Following that, we discuss the design and development of the prototype for the multi-agent scheduling systems. We conclude with a discussion of communication issues for heterogeneous agent-based scheduling systems to solve distributed scheduling problems.  相似文献   

13.
We study an on-line parallel job scheduling problem, where jobs arrive one by one. A parallel job may require a number of machines for its processing at the same time. Upon arrival of a job, its processing time and the number of requested machines become known, and it must be scheduled immediately without any knowledge of future jobs. We present a 7-competitive on-line algorithm, which improves the previous upper bound of 12 by Johannes (J. Sched. 9:433–452, 2006). Furthermore, we investigate a special case in which the largest processing time is known beforehand. A preliminary version of this paper appeared in Proceedings of the 11th Colloquium on Structural Information and Communication Complexity (SIROCCO’04, pp. 279-290). Research of D. Ye was supported by NSFC (10601048). Research of G. Zhang was supported by NSFC (60573020).  相似文献   

14.
We study several natural problems in distributed decision-making from the standpoint of competitive analysis; in these problems incomplete information is a result of the distributed nature of the problem, as opposed to the on-line mode of decision making that was heretofore prevalent in this area. In several simple situations of distributed scheduling, the competitive ratio can be computed exactly, and the different ratios can be used as a measure of the value of information and communication between decision-makers. In a more general distributed scheduling situation, we give tight upper and lower bounds on the competitive ratio achievable in the deterministic case, and give an optimal randomized algorithm with a much better competitive ratio.The research of Xiaotie Deng was supported by an NSERC grant and that of C. H. Papadimitriou was supported by an NSF grant.  相似文献   

15.
计算网格中动态负载平衡的分布调度模式   总被引:1,自引:0,他引:1  
网格计算下对资源进行有效的管理和调度可以提高系统的利用率.在对现有若干调度方法的研究和分析基础上,针对计算网格中的负载平衡问题,提出了一种分布式网格作业调度模型,并给出相关算法.算法通过建立主从模式的负载信息收集机制,提供给节点全局负载信息,加速重负载节点的负载转移速度.通过有效的负载平衡模式,解决资源调度中负载平衡及其可靠性问题.  相似文献   

16.
Cloud computing is a relatively new concept in the distributed systems and is widely accepted as a new solution for high performance and distributed computing. Its dynamisms in providing virtual resources for organisations and laboratories and its pay-per-use policy make it very popular. A workflow models a process consisting of a series of steps that shape an application. Workflow scheduling is the method for assigning each workflow task to a processing resource in a way that specific workflow rules are satisfied. Some scheduling algorithms for workflows may assume some quality of service parameter such as cost and deadline. Some efforts have been done on workflow scheduling on cloud computing environments with different service level agreements. But most of them suffer from low speed. Here, we introduce a new hybrid heuristic algorithm based on particle swarm optimisation (PSO) and gravitation search algorithms. The proposed algorithm, in addition to processing cost and transfer cost, takes deadline limitations into account. The proposed workflow scheduling approach can be used by both end-users and utility providers. The CloudSim toolkit is used as a cloud environment simulator and the Amazon EC2 pricing is the reference pricing used. Our experimental result shows about 70% cost reduction, in comparison to non-heuristic implementations, 30% cost reduction in comparison to PSO, 30% cost reduction in comparison to gravitational search algorithm and 50% cost reduction in comparison to hybrid genetic-gravitational algorithm.  相似文献   

17.
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault-tolerance, different communication media, and randomization. The resource bounds refer to time, space and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing. There is a strong emphasis in our presentation on explaining the wide variety of techniques that are used to obtain the results described.Received: September 2001, Accepted: February 2003,  相似文献   

18.
The scheduling of a many-task workflow in a distributed computing platform is a well known NP-hard problem. The problem is even more complex and challenging when the virtualized clusters are used to execute a large number of tasks in a cloud computing platform. The difficulty lies in satisfying multiple objectives that may be of conflicting nature. For instance, it is difficult to minimize the makespan of many tasks, while reducing the resource cost and preserving the fault tolerance and/or the quality of service (QoS) at the same time. These conflicting requirements and goals are difficult to optimize due to the unknown runtime conditions, such as the availability of the resources and random workload distributions. Instead of taking a very long time to generate an optimal schedule, we propose a new method to generate suboptimal or sufficiently good schedules for smooth multitask workflows on cloud platforms.Our new multi-objective scheduling (MOS) scheme is specially tailored for clouds and based on the ordinal optimization (OO) method that was originally developed by the automation community for the design optimization of very complex dynamic systems. We extend the OO scheme to meet the special demands from cloud platforms that apply to virtual clusters of servers from multiple data centers. We prove the suboptimality through mathematical analysis. The major advantage of our MOS method lies in the significantly reduced scheduling overhead time and yet a close to optimal performance. Extensive experiments were carried out on virtual clusters with 16 to 128 virtual machines. The multitasking workflow is obtained from a real scientific LIGO workload for earth gravitational wave analysis. The experimental results show that our proposed algorithm rapidly and effectively generates a small set of semi-optimal scheduling solutions. On a 128-node virtual cluster, the method results in a thousand times of reduction in the search time for semi-optimal workflow schedules compared with the use of the Monte Carlo and the Blind Pick methods for the same purpose.  相似文献   

19.
如何进一步实现云计算环境下的资源利用最大化是目前研究的热点.建立云计算环境下的资源分配模型,云计算资源调度使用蝙蝠算法,同时引入膜计算概念,提出一种基于膜计算的蝙蝠算法,将膜系统内部分解为主膜和辅助膜,在辅助膜内进行蝙蝠的个体局部寻优,将优化后的个体传送到主膜间进行全局优化,从而达到了云计算资源优化分配要求.通过CloudSim平台与其他算法进行仿真对比表明算法提高了云计算环境下的系统处理时间和效率,使得云计算环境下的资源分配更加合理.  相似文献   

20.
Distributed Scheduling (DS) problems have attracted attention by researchers in recent years. DS problems in multi-factory production are much more complicated than classical scheduling problems because they involve not only the scheduling problems in a single factory, but also the problems in the higher level, which is: how to allocate the jobs to suitable factories. It mainly focuses on solving two issues simultaneously: (i) allocation of jobs to suitable factories and (ii) determination of the corresponding production schedules in each factory. Its objective is to maximize system efficiency by finding an optimal plan for a better collaboration among various processes. However, in many papers, machine maintenance has usually been ignored during the production scheduling. In reality, every machine requires maintenance, which will directly influence the machine's availability, and consequently the planned production schedule. The objective of this paper is to propose a modified genetic algorithm approach to deal with those DS models with maintenance consideration, aiming to minimize the makespan of the jobs. Its optimization performance has been compared with other existing approaches to demonstrate its reliability. This paper also tests the influence of the relationship between the maintenance repairing time and the machine age to the performance of scheduling of maintenance during DS in the studied models.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号