Similar literature
 20 similar documents found (search time: 703 ms)
1.
Grid computing is a newly developed technology for complex systems with large-scale resource sharing, wide-area communication, and multi-institutional collaboration. Grid scheduling is an important piece of infrastructure in the grid computing environment. Most existing grid scheduling methods focus on maximizing processor utilization without taking grid load into consideration. This may lead to significant performance inefficiencies such as long job queues and processing delays. In this paper, we propose a new multiagent-based scheduling system for computational grids. Agent technology is suitable for a computational grid because of the dynamic, heterogeneous, and autonomous nature of the grid. The main idea of the proposed system is to combine static scheduling, using a fixed scheduling algorithm, with dynamic adjustment through the autonomous behavior of agents. Simulation experiments demonstrate the superiority of the proposed system in reducing the load of the grid and minimizing the response time for executing user applications.

2.
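Independent Task Scheduling in Tree-Based Grid Computing Environments   Cited by: 17 (self-citations: 1, citations by others: 17)
Task scheduling is a fundamental problem in achieving high-performance grid computing, yet designing and implementing efficient scheduling algorithms is very challenging. This paper discusses the scheduling of independent tasks in a tree-based computational grid in which both the computing power of grid resources and the network communication speed are heterogeneous. Rather than minimizing the total execution time of the tasks (a problem that has been proven NP-hard), it builds an integer linear programming model for this task scheduling problem and derives from that model the optimal task allocation scheme, i.e., the optimal number of tasks assigned to each computing node. Based on the optimal task allocation scheme, two dynamic, demand-driven task allocation heuristics are then constructed: OPCHATA (optimization-based priority-computation heuristic algorithm for task allocation) and OPBHATA (optimization-based priority-bandwidth heuristic algorithm for task allocation). Experimental results show that, when scheduling a large number of independent tasks in a heterogeneous tree-based computational grid, the proposed algorithms clearly outperform other algorithms.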

3.
All existing fault-tolerant job scheduling algorithms for computational grids were proposed under the assumption that all sites apply the same fault-tolerance strategy. They ignore the fact that each grid site may have its own fault-tolerance strategy, because each site is itself an autonomous domain. In fact, it is very common for multiple fault-tolerance strategies to be adopted at the same time in a large-scale computational grid. Different fault-tolerance strategies may have different hardware and software requirements. For instance, if a grid site employs the job checkpointing mechanism, each computation node must be able to do the following: periodically, the node transmits the transient state of the job execution to the server, and if a job fails, it migrates to another computation node and resumes from the last stored checkpoint. Therefore, in this paper we propose a genetic algorithm for job scheduling that addresses the problem of heterogeneous fault-tolerance mechanisms in a computational grid. We assume that the system supports four kinds of fault-tolerance mechanisms: job retry, job migration without checkpointing, job migration with checkpointing, and job replication. Because each fault-tolerance mechanism has different requirements for gene encoding, we also propose a new chromosome encoding approach that integrates the four kinds of mechanisms in one chromosome. The risky nature of the grid environment is also taken into account in the algorithm. The risk relationship between jobs and nodes is defined by the security demand and the trust level. Simulation results show that our algorithm achieves a shorter makespan and is more effective at improving the job failure rate than the Min-Min and Sufferage algorithms.

4.
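A QoS-Based Adaptive Failure Detector for Grids   Cited by: 2 (self-citations: 0, citations by others: 2)
董剑, 左德承, 刘宏伟, 杨孝宗. 《软件学报》 (Journal of Software), 2006, 17(11): 2362-2372
A failure detector is one of the fundamental components needed to build a reliable grid computing environment. Because a grid hosts a large number of distributed applications with different QoS requirements for failure detection, a grid failure detector, to remain effective and scalable, should both accurately deliver the failure-detection QoS that each application requires and avoid the redundant load that would result from designing multiple failure detectors to satisfy different QoS levels. Based on the basic QoS evaluation metrics, a new failure detector, GA-FD (adaptive failure detector for grid), is implemented using an active detection strategy in PULL mode. It can simultaneously support the quantitatively specified QoS requirements of multiple applications and needs no assumptions about message behavior or clock synchronization. It is also proved that GA-FD can implement a failure detector of class ◇P under the partially synchronous model, and the corresponding experiments and data are given.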
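As a rough illustration of pull-mode detection with an adaptive timeout, the sketch below widens the timeout whenever a reply arrives late, which is the usual way an eventually perfect detector stops making false suspicions. It is not GA-FD itself: the class name, the `safety_factor` parameter, and the rule for widening the timeout are assumptions made for this example, and the per-application QoS handling described in the abstract is not modeled.

```python
class PullFailureDetector:
    """Hypothetical pull-style detector: the monitor probes a node and waits for a reply."""

    def __init__(self, timeout=1.0, safety_factor=2.0):
        self.timeout = timeout              # current wait limit, in seconds
        self.safety_factor = safety_factor  # assumed margin applied after a late reply
        self.suspected = False
        self.false_suspicions = 0

    def observe(self, round_trip):
        """Process one probe.

        round_trip is the time until the reply arrived, or None if no reply
        ever arrived. Returns True if the node is suspected afterwards.
        """
        if round_trip is None:
            # No reply at all: suspect the node.
            self.suspected = True
        elif round_trip > self.timeout:
            # The reply came after the timeout fired, so the node was
            # suspected by mistake. Withdraw the suspicion and widen the
            # timeout so that a delay of this size is tolerated next time.
            self.false_suspicions += 1
            self.timeout = round_trip * self.safety_factor
            self.suspected = False
        else:
            self.suspected = False
        return self.suspected
```

Feeding the detector round-trip times of 0.5 s, 1.5 s, and 2.0 s with an initial timeout of 1.0 s produces one false suspicion (the 1.5 s reply) and a widened timeout of 3.0 s, after which the 2.0 s reply is accepted without suspicion.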

5.
Energy usage and its associated costs have taken on a new level of significance in recent years. Globally, energy costs that include the cooling of server rooms are now comparable to hardware costs, and these costs are increasing with the rising cost of energy. As a result, there are efforts worldwide to design more efficient scheduling algorithms. Designing such scheduling algorithms for grids is further complicated by the fact that the different sites in a grid system are likely to have different owners. As such, it is not enough to simply minimize the total energy usage in the grid; instead, one needs to simultaneously minimize energy usage across all the different providers in the grid. Apart from the multiple ownership of the different sites, a grid differs from traditional high-performance computing systems in the heterogeneity of the computing nodes as well as of the communication links that connect the nodes together. In this paper, we propose a cooperative, power-aware game-theoretic solution to the job scheduling problem in grids. We discuss our cooperative game model and present the structure of the Nash Bargaining Solution. Our proposed scheduling scheme maintains a specified Quality of Service (QoS) level and minimizes energy usage across all the providers simultaneously; energy usage is kept at a level that is sufficient to maintain the desired QoS level. Further, the proposed algorithm is fair to all users and performs robustly against inaccuracies in performance prediction information.

6.
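Considering the heterogeneous, autonomous, and dynamic nature of grid resources, this paper discusses task scheduling when local users have preemptive priority and proposes the TBBS (Time-Balancing Based Scheduling) algorithm. A scheduling optimization model is established that selects the best combination of resources for executing a task, with the goal of minimizing the expected completion time. A time-balancing strategy decomposes the task and schedules the subtasks onto the resources, which reduces the delay caused by waiting when subtasks synchronize and yields good parallel computing performance. A repeated-scheduling strategy is adopted to adapt to the characteristics of resources in a computational grid.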
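The time-balancing idea can be illustrated with a small sketch: if a divisible task is split in proportion to each resource's speed, all subtasks finish at the same moment and none waits at the synchronization point. This is only the core proportional split, not the TBBS algorithm; the function name and the assumption of fixed, known speeds (no preemption by local users) are simplifications introduced here.

```python
def time_balanced_split(total_work, speeds):
    """Split a divisible task so that every resource finishes at the same time.

    total_work: amount of work to distribute (arbitrary units).
    speeds: work units per second for each resource (assumed known and fixed).
    Returns (shares, finish_time), where shares[i] is the work given to resource i.
    """
    if total_work < 0:
        raise ValueError("total_work must be non-negative")
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive numbers")
    # With shares proportional to speed, share_i / speed_i is identical for
    # all resources, so every subtask completes at the same finish_time.
    finish_time = total_work / sum(speeds)
    shares = [finish_time * s for s in speeds]
    return shares, finish_time
```

For example, 120 units of work on resources with speeds 1, 2, and 3 gives shares of 20, 40, and 60 units, and every resource finishes after 20 seconds.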

7.
Scheduling Policies for Processor Coallocation in Multicluster Systems   Cited by: 1 (self-citations: 0, citations by others: 1)
Building multicluster systems out of multiple, geographically distributed clusters interconnected by high-speed wide-area networks can provide access to greater computational power and to a wider range of resources. Jobs running on multiclusters and, more generally, in grids may require (processor) coallocation, i.e., the simultaneous allocation of resources (processors) in different clusters or subsystems of a grid. In this paper, we propose four scheduling policies for processor coallocation in multiclusters, and we use simulations to assess their performance under a wide variety of parameter settings. In particular, our simulations use synthetic workloads and workloads derived from the logs of actual systems and from runtime measurements. We conclude that although coallocation makes scheduling more difficult and wide-area communication critically affects performance, there is a wide range of realistic applications that may benefit from coallocation. However, unrestricted coallocation is not recommended: limiting the total job size, or the number or sizes of job components, improves performance.

8.
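To improve the execution performance of homogeneous applications in computational grids, a sub-job assignment method is proposed. For a compute-intensive application, inter-task communication is negligible, so such a job is partitioned into several sub-jobs and different sub-jobs are assigned to different clusters; the partitioning is done according to grid load balance. Applications that are not compute-intensive rarely achieve satisfactory performance in multi-site computing, so such a job is assigned as a whole to a single cluster. To find the most suitable cluster, the processor performance and inter-processor communication performance of each cluster are measured, and the job's running time is predicted from an application performance model. Experiments show that the sub-job assignment method is effective in optimizing the execution performance of homogeneous applications.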

9.
An ant algorithm for balanced job scheduling in grids   Cited by: 1 (self-citations: 1, citations by others: 0)
Grid computing utilizes distributed heterogeneous resources to support complicated computing problems. Grids can be classified into two types: computing grids and data grids. Job scheduling in a computing grid is a very important problem. To utilize grids efficiently, we need a good job scheduling algorithm to assign jobs to resources. In the natural environment, ants have a tremendous ability to team up to find an optimal path to food sources. An ant algorithm simulates this behavior. In this paper, we propose a Balanced Ant Colony Optimization (BACO) algorithm for job scheduling in the grid environment. The main contribution of our work is to balance the entire system load while trying to minimize the makespan of a given set of jobs. According to the experimental results, BACO outperforms the other job scheduling algorithms it was compared with.

10.
Pull-based overlays are used in some of today’s largest computational grids. Job agents are submitted to resources with the duty of retrieving real workload from a central queue at runtime and executing it. This model helps overcome the problems of direct job submission in highly complex grid environments, namely heterogeneity, imprecise status information, relatively high failure rates, and slow adaptation to changes in grid conditions or user priorities. This article presents a distributed scheduling architecture for such late-binding overlays. In this architecture, execution nodes share a distributed hash table and cooperatively perform job assignment. As our experiments show, the scalability problems of centralized matching are avoided, achieving low and predictable scheduling overheads even when executing large workflows, and total turnaround times are improved. This is in line with the predictions of a theoretical model of grid workflow execution that the article also discusses. Scalability makes fine-grained scheduling possible and enables new functionality, such as a distributed data cache shared by the execution nodes, which helps relieve the commonly congested storage services. In addition, we show that our system is more resilient to problems such as communication breakdowns between computation centres. Moreover, the new architecture is better prepared to deal with demanding scenarios such as intense demand for popular data files or remote data processing.

11.
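Based on mesh domain decomposition, this paper proposes a new parallel Sn algorithm for particle transport on unstructured meshes that computes multiple angular directions and multiple energy groups simultaneously. During the computation, no priority calculation or priority-queue maintenance is needed; the parallel computation only has to be organized in the order of the computation queue. Taking into account the data dependencies of all directions and all mesh points, and combining them with the B-level priority, a priority computation method is proposed that gives precedence to tasks that need to send data and delays tasks that need to receive data, so as to reduce processor waiting time and overlap computation with communication. Numerical experiments on two-dimensional particle transport problems using the proposed parallel Sn algorithm and priority queue show that the parallel algorithm achieves good parallel speedup: when scaled to 1,024 processors, the parallel efficiency relative to 64 processors reaches 52%.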

12.
In grid computing, resource management and fault tolerance services are important issues. The availability of the resources selected for job execution is a primary factor that determines computing performance. In this paper, we propose a resource manager for optimal resource selection. Using a genetic algorithm, our resource manager automatically selects, from among the candidate resources, the set of resources that achieves optimal performance. Typically, the probability of a failure is higher in grid computing than in traditional parallel computing, and the failure of resources can be fatal to job execution. Therefore, a fault tolerance service is essential in computational grids. Grid services are also often expected to meet some minimum level of Quality of Service (QoS) for desirable operation. To address this issue, we also propose a fault tolerance service that satisfies QoS requirements. We extend the conventional notion of failures in distributed systems in order to provide a fault tolerance service that deals with various types of resource failures, including process failures, processor failures, and network failures. We also design and implement a fault detector and a fault manager. The implementation and simulation results indicate that our approaches are promising in that (1) the resource manager finds the optimal set of resources that guarantees efficient job execution, (2) the fault detector detects the occurrence of resource failures, and (3) the fault manager guarantees that submitted jobs complete and that job execution performance is improved by job migration even if some failures occur.

13.
Job scheduling in utility grids should take into account the incentives for both grid users and resource providers. However, most existing studies on job scheduling in utility grids address the incentive for only one party, i.e., either the users or the resource providers. The very few studies that consider incentives for both parties do not address cost, one of the most attractive incentives for users. In this paper, we study job scheduling in utility grids by optimizing the incentives for both parties. We propose a multi-objective optimization approach: maximizing the successful execution rate of jobs and minimizing the combined cost (incentives for grid users), and minimizing the fairness deviation of profits (incentive for resource providers). The proposed multi-objective optimization approach could offer sufficient incentives for the two parties to stay and play in the utility grid. A heuristic scheduling algorithm called the Cost-Greedy Price-Adjusting (CGPA) algorithm is developed to optimize the incentives for both parties. Simulation results show that the CGPA algorithm is effective and, in most cases, leads to a higher successful execution rate, lower combined cost, and lower fairness deviation than some popular algorithms.

14.
The n-dimensional grid is one of the most representative patterns of data flow in parallel computation. Many scientific algorithms that require nearest-neighbor communication in a lattice space are modeled by a task graph with the properties of a simple or enhanced grid. The two most frequently used scheduling models for grids are unit execution time with zero communication delay (UET) and unit execution time–unit communication time (UET-UCT). In this paper we introduce an enhanced model of the n-dimensional grid by adding extra diagonal edges and allowing unequal boundaries for each dimension. For this generalized grid topology we establish the optimal makespan for both UET and UET-UCT grids. Then we give a closed formula that calculates the minimum number of processors required to achieve the optimal makespan. Finally, we propose a low-complexity optimal time and processor scheduling strategy for both cases.
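To make the UET case concrete, the sketch below computes the critical-path length of a grid task graph by dynamic programming. Under UET with zero communication delay and enough processors, no schedule can be shorter than this length, and starting each task at its level achieves it. The edge set is an assumption made for illustration (one unit edge per axis, plus an optional single diagonal edge that advances along all axes at once); the paper's enhanced grid and its closed formulas may be defined differently.

```python
from itertools import product


def critical_path_length(sizes, diagonal=False):
    """Longest chain of unit-time tasks in an n-dimensional grid task graph.

    sizes[i] is the number of grid points along dimension i (assumed >= 1).
    Each task depends on its predecessor along every axis; with
    diagonal=True it also depends on the task one step back along all axes
    at once (a hypothetical choice of diagonal edge).
    """
    if not sizes or any(s < 1 for s in sizes):
        raise ValueError("sizes must be a non-empty sequence of positive integers")
    n = len(sizes)
    steps = [tuple(1 if j == i else 0 for j in range(n)) for i in range(n)]
    if diagonal:
        steps.append((1,) * n)
    level = {}
    # product() yields points in lexicographic order, so every predecessor
    # of a point has already been assigned a level when the point is reached.
    for point in product(*(range(s) for s in sizes)):
        preds = (tuple(p - d for p, d in zip(point, step)) for step in steps)
        # Predecessors outside the grid are simply absent from the dict.
        level[point] = 1 + max((level[q] for q in preds if q in level), default=0)
    return max(level.values())
```

For a 3 x 4 grid the function returns 6, which equals the sum of (size - 1) over the dimensions plus 1. Adding the diagonal edge leaves the value unchanged, because a diagonal step skips tasks rather than lengthening any chain.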

15.
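To address the multi-objective task scheduling problem in grid computing, a multi-objective grid task scheduling algorithm based on adaptive neighborhoods is proposed. The algorithm solves for the non-inferior (non-dominated) solution set of several grid task scheduling objective functions and uses an adaptive neighborhood method to preserve the distribution of the multi-objective solution set, in an attempt to solve the cooperative multi-objective optimization problem in grid task scheduling. Experimental results show that the algorithm effectively balances the time and cost objectives, improves resource utilization and task execution efficiency, and performs better than the Min-min and Max-min algorithms.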
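The notion of a non-inferior (non-dominated) solution set can be shown with a short sketch. Each candidate schedule is reduced to a tuple of objective values to be minimized, here assumed to be (completion time, cost). The function names and sample values are hypothetical, and the adaptive-neighborhood step of the proposed algorithm is not reproduced.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(candidates):
    """Return the candidates that no other candidate dominates (all objectives minimized)."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical schedules described by (completion time, cost).
schedules = [(10, 5), (8, 7), (12, 4), (9, 9), (8, 6)]
front = non_dominated(schedules)
```

Here `front` is `[(10, 5), (12, 4), (8, 6)]`: the schedule (8, 6) dominates both (8, 7) and (9, 9), while the three remaining schedules each trade time against cost and so cannot be ranked against one another.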

16.
Cooperation among multi-domain massively parallel processor systems in a computing grid environment provides new opportunities for multisite job scheduling. At the same time, in the area of co-allocation, heterogeneity, network adaptability, and scalability raise challenges for the international design of multisite job scheduling models and algorithms. This paper presents a multisite job scheduling scheme by introducing a multisite job scheduling model and a performance model for the grid environment. It introduces two multisite cooperative job scheduling models and algorithms built around optimal and greedy-heuristic resource selection strategies. A comparison with the single-site and multisite cooperative scheduling models and algorithms introduced by Sabin, Yahyapour, and others demonstrates the validity and advancement of the scheduling model and performance model presented here.

17.
Game-Theoretic Approach for Load Balancing in Computational Grids   Cited by: 1 (self-citations: 0, citations by others: 1)
Load balancing is a very important and complex problem in computational grids. A computational grid differs from traditional high-performance computing systems in the heterogeneity of the computing nodes, as well as of the communication links that connect the nodes together. There is a need to develop algorithms that can capture this complexity yet can be easily implemented and used to solve a wide range of load-balancing scenarios. In this paper, we propose a game-theoretic solution to the grid load-balancing problem. The algorithm developed combines the inherent efficiency of the centralized approach with the fault-tolerant nature of the distributed, decentralized approach. We model the grid load-balancing problem as a noncooperative game in which the objective is to reach the Nash equilibrium. Experiments were conducted to show the applicability of the proposed approaches. One advantage of our scheme is its relatively low overhead and robust performance against inaccuracies in performance prediction information.

18.
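An adaptive scheduling algorithm for tree-based computational grids is proposed that achieves adaptive optimal scheduling of both fine-grained independent tasks and large user jobs. Through real-time monitoring of the grid environment, an adaptive scheduling algorithm is given that is based on node load, node task execution time, task transfer time, and task characteristics, namely a heuristic task scheduling algorithm based on the optimal task allocation scheme. Experimental comparison with other scheduling algorithms shows that the proposed task scheduling algorithm has clear advantages in load balance and optimal makespan.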

19.
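Communication delays exist between resources in a grid. Redundancy through task duplication can reduce the communication overhead between tasks and shorten the computation time of the whole program. Current grid task scheduling algorithms generally do not consider task duplication, while duplication-based scheduling algorithms tend to generate too many duplicated tasks, increasing system overhead and possibly even delaying the computation. Because duplication-based task scheduling is an NP problem, this paper proposes a duplication-based grid resource scheduling algorithm whose primary objective is to reduce the schedule length and whose secondary objectives are to reduce the number of duplicated tasks and the number of resources occupied. The algorithm equals or outperforms other algorithms in schedule length, number of duplicated tasks, and number of resources occupied.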

20.
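Targeting the autonomy, heterogeneity, and distribution of computing nodes in a grid environment, a grid scheduling algorithm is proposed that is based on dynamically corrected prediction of task response time and on task-flow shaping. Using historical data together with the submission times, completion times, and network communication delays of task requests that recently accessed a computing node, the method predicts the node's future task response time and submits tasks to the nodes predicted to be lightly loaded or better performing. A dynamic correction algorithm and a task-flow shaping algorithm are used to reduce prediction error and improve resource utilization. Experimental results show that the method outperforms traditional algorithms such as random scheduling in task response time and task throughput, and has good overall performance.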
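One simple way to picture a dynamically corrected response-time prediction is exponential smoothing: after each completed task the estimate for a node is moved by a fraction of the latest prediction error, and new tasks go to the node with the lowest estimate. The sketch below assumes this technique for illustration only. The abstract does not specify the correction rule, so the class name, the `alpha` parameter, and the update formula are all assumptions, and task-flow shaping is not modeled.

```python
class ResponseTimePredictor:
    """Hypothetical per-node response-time predictor using exponential smoothing."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha   # assumed correction weight, between 0 and 1
        self.estimate = {}   # node name -> predicted response time

    def update(self, node, submitted_at, completed_at):
        """Correct the node's estimate using one finished task.

        The observed response time (completion minus submission) includes
        queueing, execution, and network delay.
        """
        observed = completed_at - submitted_at
        if node not in self.estimate:
            self.estimate[node] = observed
        else:
            error = observed - self.estimate[node]
            self.estimate[node] += self.alpha * error

    def best_node(self):
        """Return the node with the lowest predicted response time."""
        return min(self.estimate, key=self.estimate.get)
```

With `alpha=0.5`, a node whose estimate is 2.0 and whose next task takes 10.0 is corrected to 6.0, so a single slow task shifts the prediction without replacing the node's history outright.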
