Similar literature
 Found 20 similar documents (search time: 203 ms)
1.
An ant algorithm for balanced job scheduling in grids   Total citations: 1 (self-citations: 1, citations by others: 0)
Grid computing utilizes distributed heterogeneous resources to support complicated computing problems. Grids can be classified into two types: computing grids and data grids. Job scheduling in a computing grid is a very important problem. To utilize grids efficiently, we need a good job scheduling algorithm to assign jobs to resources in grids. In the natural environment, ants have a tremendous ability to team up to find an optimal path to food resources. An ant algorithm simulates the behavior of ants. In this paper, we propose a Balanced Ant Colony Optimization (BACO) algorithm for job scheduling in the Grid environment. The main contributions of our work are to balance the entire system load while trying to minimize the makespan of a given set of jobs. According to the experimental results, BACO outperforms the other job scheduling algorithms.

2.
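For workloads that mix compute-intensive and data-intensive jobs, in a dynamic environment where jobs have time limits, the traditional grid job scheduling methods are extended and three heuristic grid job scheduling algorithms are proposed: Emin-min, Ebest, and Esufferage. The three algorithms are validated on a grid model composed of multiple clusters connected by a high-speed network. Comparison with the Min-min algorithm shows that all three algorithms outperform it. Comparison with the ASJS algorithm shows that Emin-min reduces both the waiting time and the makespan of jobs; Esufferage reduces the waiting time and makespan at the cost of completing fewer jobs; Ebest completes roughly the same number of jobs as ASJS but increases the waiting time and makespan. Overall, Emin-min has a considerable advantage.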
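
For readers unfamiliar with the Min-min baseline that entry 2 compares against, the following is a minimal sketch of the classic heuristic only. It is not the Emin-min, Ebest, or Esufferage algorithm from the paper, whose details the abstract does not give. The `etc` matrix (expected time to compute each task on each machine) and the function name are assumptions made for illustration.

```python
def min_min(etc):
    """Classic Min-min list scheduling.

    etc[t][m] is a hypothetical expected time to compute task t on machine m.
    Returns (schedule, makespan), where schedule maps task -> (machine, finish time).
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines          # time at which each machine becomes free
    unscheduled = set(range(len(etc)))
    schedule = {}

    while unscheduled:
        # For every unscheduled task, find its earliest completion time and machine.
        best = {}
        for t in unscheduled:
            m = min(range(n_machines), key=lambda j: ready[j] + etc[t][j])
            best[t] = (ready[m] + etc[t][m], m)

        # Pick the task whose earliest completion time is smallest (ties: lowest index).
        t = min(unscheduled, key=lambda k: (best[k][0], k))
        finish, m = best[t]

        schedule[t] = (m, finish)
        ready[m] = finish               # the chosen machine is busy until then
        unscheduled.remove(t)

    return schedule, max(ready)


if __name__ == "__main__":
    # Three tasks, two machines (illustrative numbers only).
    plan, span = min_min([[3, 5], [2, 4], [6, 1]])
    print(plan, span)
```

Min-min favours short tasks, which is one reason deadline-aware or sufferage-style variants are often studied as alternatives.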

3.
Fine-Grained Cycle Sharing (FGCS) systems aim at utilizing the large amount of computational resources available on the Internet. In FGCS, host computers allow guest jobs to utilize the CPU cycles if the jobs do not significantly impact the local users. Such resources are generally provided voluntarily and their availability fluctuates highly. Guest jobs may fail unexpectedly, as resources become unavailable. To improve this situation, we consider methods to predict resource availability. This paper presents empirical studies on resource availability in FGCS systems and a prediction method. From studies on resource contention among guest jobs and local users, we derive a multi-state availability model. The model enables us to detect resource unavailability in a non-intrusive way. We analyzed the traces collected from a production FGCS system for 3 months. The results suggest the feasibility of predicting resource availability, and motivate our method of applying semi-Markov Process models for the prediction. We describe the prediction framework and its implementation in a production FGCS system, named iShare. Through the experiments on an iShare testbed, we demonstrate that the prediction achieves an accuracy of 86% on average and outperforms linear time series models, while the computational cost is negligible. Our experimental results also show that the prediction is robust in the presence of irregular resource availability. We tested the effectiveness of the prediction in a proactive scheduler. Initial results show that applying availability prediction to job scheduling reduces the number of jobs that fail due to resource unavailability. This work was supported, in part, by the National Science Foundation under Grants No. 0103582-EIA, 0429535-CCF, and 0650016-CNS. We thank Ruben Torres for his help with the reference prediction algorithms used in our experiments.

4.
Fairness is an important aspect in queuing systems. Several fairness measures have been proposed in queuing systems in general and parallel job scheduling in particular. Generally, a scheduler is considered unfair if some jobs are discriminated against whereas others are favored. Some of the metrics used to measure fairness for parallel job schedulers can imply unfairness where there is no discrimination (and vice versa). This makes them inappropriate. In this paper, we show how the existing approach misrepresents fairness in practice. We then propose a new approach for measuring fairness for parallel job schedulers. Our approach is based on two principles: (i) as jobs have different resource requirements and find different queue/system states, they need not have the same performance for the scheduler to be fair and (ii) to compare two schedulers for fairness, we make comparisons of how the schedulers favor or discriminate against individual jobs. We use performance and discrimination trends to validate our approach. We observe that our approach can deduce discrimination more accurately. This is true even in cases where the jobs discriminated against most are not the worst performing jobs. Copyright © 2008 John Wiley & Sons, Ltd.

5.
In this paper we address a multicriteria scheduling problem for computational Grid systems. We focus on the two-level hierarchical Grid scheduling problem, in which at the first level (the Grid level) a Grid broker makes scheduling decisions and allocates jobs to Grid nodes. Jobs are then sent to the Grid nodes, where local schedulers generate local schedules for each node accordingly. A general approach is presented taking into account preferences of all the stakeholders of Grid scheduling (end-users, Grid administrators, and local resource providers) and assuming a lack of knowledge about job time characteristics. A single-stakeholder, single-criterion version of the approach has been compared experimentally with the existing approaches.

6.
We address non-preemptive non-clairvoyant online scheduling of parallel jobs on a Grid. We consider a Grid scheduling model with two stages. At the first stage, jobs are allocated to a suitable Grid site, while at the second stage, local scheduling is independently applied to each site. We analyze allocation strategies depending on the type and amount of information they require. We conduct a comprehensive performance evaluation study using simulation and demonstrate that our strategies perform well with respect to several metrics that reflect both user- and system-centric goals. Unfortunately, user run time estimates and information on local schedules do not help to significantly improve the outcome of the allocation strategies. When examining the overall Grid performance based on real data, we determined that an appropriate distribution of job processor requirements over the Grid yields higher performance than an allocation of jobs based on user run time estimates and information on local schedules. In general, our experiments showed that rather simple schedulers with minimal information requirements can provide good performance.

7.
Data Grid integrates geographically distributed resources for solving data-intensive scientific applications. Effective scheduling in a Grid can reduce the amount of data transferred among nodes by submitting a job to a node where most of the requested data files are available. Scheduling is a traditional problem in parallel and distributed systems. However, due to the special issues and goals of the Grid, traditional approaches are no longer effective in this environment. Therefore, it is necessary to propose methods specialized for this kind of parallel and distributed system. Another solution is to use a data replication strategy to create multiple copies of files and store them in convenient locations to shorten file access times. To utilize the above two concepts, in this paper we develop a job scheduling policy, called hierarchical job scheduling strategy (HJSS), and a dynamic data replication strategy, called advanced dynamic hierarchical replication strategy (ADHRS), to improve the data access efficiencies in a hierarchical Data Grid. HJSS uses hierarchical scheduling to reduce the search time for an appropriate computing node. It considers network characteristics, the number of jobs waiting in the queue, file locations, and the disk read speed of the storage drive at data sources. Moreover, due to the limited storage capacity, a good replica replacement algorithm is needed. We present a novel replacement strategy which deletes files in two steps when free space is not enough for the new replica. First, it deletes those files with the minimum transfer time. Second, if space is still insufficient, it considers the last time the replica was requested, the number of accesses, the size of the replica, and the file transfer time. The simulation results show that our proposed algorithm has better performance in comparison with other algorithms in terms of job execution time, number of intercommunications, number of replications, hit ratio, computing resource usage and storage usage.

8.
In this paper, we propose a novel distributed resource-scheduling algorithm capable of handling multiple resource requirements for jobs that arrive in a Grid computing environment. In our proposed algorithm, referred to as the multiple resource scheduling (MRS) algorithm, we take into account both the site capabilities and the resource requirements of jobs. The main objective of the algorithm is to obtain a minimal execution schedule through efficient management of available Grid resources. We first propose a model in which the job and site resource characteristics can be captured together and used in the scheduling algorithm. To do so, we introduce the concept of an n-dimensional virtual map and resource potential. Based on the proposed model, we conduct rigorous simulation experiments with real-life workload traces reported in the literature to quantify the performance. We compare our strategy with most of the commonly used algorithms in place on performance metrics such as job wait times, queue completion times, and average resource utilization. Our combined consideration of job and resource characteristics is shown to render high performance with respect to the above-mentioned metrics in the environment. Our study also reveals that the MRS scheme is able to adapt to both serial and parallel job requirements, especially when job fragmentation occurs. Our experimental results clearly show that MRS outperforms other strategies and we highlight the impact and importance of our strategy.

9.
Meta-schedulers map jobs to computational resources that are part of a Grid, such as clusters, that in turn have their own local job schedulers. Existing Grid meta-schedulers either target system-centric metrics, such as utilisation and throughput, or prioritise jobs based on utility metrics provided by the users. The system-centric approach gives less importance to users’ individual utility, while the user-centric approach may have adverse effects such as poor system performance and unfair treatment of users. Therefore, this paper proposes a novel meta-scheduler, based on the well-known double auction mechanism that aims to satisfy users’ service requirements as well as ensuring balanced utilisation of resources across a Grid. We have designed valuation metrics that commodify both the complex resource requirements of users and the capabilities of available computational resources. Through simulation using real traces, we compare our scheduling mechanism with other common mechanisms widely used by both existing market-based and traditional meta-schedulers. The results show that our meta-scheduling mechanism not only satisfies up to 15% more user requirements than others, but also improves system utilisation through load balancing.

10.
Most current research in Grid computing is still focused on improving the performance of Grid schedulers. However, unlike traditional scheduling, in Grid systems there are other important requirements to be taken into account. One such requirement is secure scheduling, namely achieving an efficient allocation of tasks to reasonably trustworthy resources. In this paper we formalize the Grid scheduling problem as a non-cooperative non-zero sum game of the Grid users in order to address the security requirements. The premise of this model is that in a large-scale Grid, cooperation among all users in the system is unlikely to happen. The users’ cost of playing the game is interpreted as the total cost of secure job execution in the Grid. The game cost function is minimized, at global (Grid) and local (users) levels, by using four genetic-based hybrid meta-heuristics. We have evaluated the proposed model under heterogeneous, large-scale and dynamic conditions using a Grid simulator. The relative performance of the four hybrid schedulers is measured by the makespan and flowtime metrics. The obtained results suggest that it is more resilient for the Grid users to pay some additional scheduling cost, due to verification of the security conditions, instead of taking the risk of assigning their tasks to unreliable resources.
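
The makespan and flowtime metrics named in entry 10 can be illustrated with a small sketch. It uses the common textbook definitions (makespan is the finish time of the last task, flowtime is the sum of all task finish times) and assumes every task is released at time 0 and each machine runs its tasks back to back. The function name and input layout are hypothetical and are not taken from the paper.

```python
def makespan_and_flowtime(assignment):
    """Compute makespan and flowtime for a fixed task-to-machine assignment.

    assignment maps a machine name to the list of task run times it executes,
    in execution order. All tasks are assumed to be released at time 0.
    """
    makespan = 0.0
    flowtime = 0.0
    for runtimes in assignment.values():
        clock = 0.0                     # local time on this machine
        for r in runtimes:
            clock += r                  # finish time of this task
            flowtime += clock           # flowtime sums every task's finish time
        makespan = max(makespan, clock) # the machine that finishes last decides
    return makespan, flowtime


if __name__ == "__main__":
    # Machine "a" runs tasks of length 2 then 3; machine "b" runs one task of length 4.
    print(makespan_and_flowtime({"a": [2, 3], "b": [4]}))
```

Running shorter tasks first on a machine lowers flowtime without changing makespan, which is why the two metrics are usually reported together.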

11.
Grid resource management systems and schedulers are important components for building Grids. They are responsible for the selection and allocation of Grid resources to current and future applications. Thus, they are important building blocks for making Grids available to user communities. In this paper we briefly analyze the requirements of Grid resource management and provide a classification of schedulers. Then, we define an extensible formal model for Grid scheduling activities, and characterize the general Grid scheduling problem. Finally, we provide a reference architecture for the support of our model and discuss different aspects of architectural implementations.

12.
Unpredictable fluctuations in resource availability often lead to rescheduling decisions that sacrifice the job completion success rate in batch job scheduling. To overcome this limitation, we consider the problem of assigning a set of sequential batch jobs with demands to a set of resources with constraints such as heterogeneous rescheduling policies and capabilities. The ultimate goal is to find an optimal allocation such that performance benefits in terms of makespan and utilization are maximized according to the principle of Pareto optimality, while maintaining the job failure rate close to an acceptably low bound. To this end, we formulate a multihybrid policy decision problem (MPDP) on the primary-backup fault tolerance model and theoretically show its NP-completeness. The main contribution is to prove that our multihybrid job scheduling (MJS) scheme guarantees the fault-tolerant performance by adaptively combining jobs and resources with different rescheduling policies in MPDP. Furthermore, we demonstrate that the proposed MJS scheme outperforms the five rescheduling heuristics in solution quality, searching adaptability and time efficiency by conducting a set of extensive simulations under various scheduling conditions.

13.
This paper addresses the problem of minimizing the scheduling length (make-span) of a batch of jobs with different arrival times. A job is described by a directed acyclic graph (DAG) of parallel tasks. The paper proposes a dynamic scheduling method that adapts the schedule when new jobs are submitted and that may change the processors assigned to a job during its execution. The scheduling method is divided into a scheduling strategy and a scheduling algorithm. We also propose an adaptation of the Heterogeneous Earliest-Finish-Time (HEFT) algorithm, called here P-HEFT, to handle parallel tasks in heterogeneous clusters with good efficiency without compromising the makespan. The results of a comparison of this algorithm with another DAG scheduler, using a simulation of several machine configurations and job types, show that P-HEFT gives a shorter makespan for a single DAG but scores worse for multiple DAGs. Finally, the results of the dynamic scheduling of a batch of jobs using the proposed scheduling method showed significant improvements for more heavily loaded machines when compared to the alternative resource reservation approach.

14.
Scheduling and resource allocation in large-scale distributed environments, such as Computational Grids (CGs), give rise to new requirements and challenges not considered in traditional distributed computing environments. Among these new requirements, task abortion and security become necessary criteria for Grid schedulers. The former arises due to the dynamics of Grid systems, in which resources are expected to enter and leave the system in an unpredictable way. The latter requirement appears crucial in Grid systems mainly due to the multi-domain nature of CGs. The main aim of this paper is to develop a scheduling model that enables the aggregation of task abortion and security requirements, as additional scheduling criteria together with makespan and flowtime, into a cumulative objective function. We demonstrate the high effectiveness of genetic-based schedulers in finding near-optimal solutions for the multi-objective scheduling problem, where all criteria (objectives) are simultaneously optimized. The proposed meta-heuristics are experimentally evaluated in static and dynamic Grid scenarios by using a Grid simulator. The obtained results show a fast reduction of the values of basic scheduler performance metrics, especially in the dynamic case, which confirms the usefulness of the proposed approach in real-life scenarios.

15.
Grid resources are typically diverse in nature with respect to their software and hardware configurations, resource usage policies and the kinds of applications they support. Aggregating and monitoring these resources, and discovering suitable resources for the applications, becomes a challenging issue. This is partially due to the representation of Grid metadata supported by the existing Grid middleware, which offers limited scope for matching the job requirements that directly affect scheduling decisions. This paper proposes a semantic component in conventional Grid architecture to support ontology‐based representation of Grid metadata and facilitate context‐based information retrieval that complements Grid schedulers for effective resource management. The Web Ontology Language is used for creating the Grid resource ontology and the Algernon inference engine has been used for resource discovery. This semantic component has been integrated with conventional Grid schedulers. Several experiments have also been carried out to investigate the performance overhead that arises while integrating this component with Grid schedulers. Copyright © 2009 John Wiley & Sons, Ltd.

16.
This paper presents a three-stage algorithm for resource-aware scheduling of computational jobs in a large-scale heterogeneous data center. The algorithm aims to allocate job classes to machine configurations to attain an efficient mapping between job resource request profiles and machine resource capacity profiles. The first stage uses a queueing model that treats the system in an aggregated manner with pooled machines and jobs represented as a fluid flow. The latter two stages use combinatorial optimization techniques to solve a shorter-term, more accurate representation of the problem using the first-stage, long-term solution for heuristic guidance. In the second stage, jobs and machines are discretized. A linear programming model is used to obtain a solution to the discrete problem that maximizes the system capacity given a restriction on the job class and machine configuration pairings based on the solution of the first stage. The final stage is a scheduling policy that uses the solution from the second stage to guide the dispatching of arriving jobs to machines. We present experimental results of our algorithm on both Google workload trace data and generated data and show that it outperforms existing schedulers. These results illustrate the importance of considering heterogeneity of both job and machine configuration profiles in making effective scheduling decisions.

17.
Rapid advancement and readier availability of Grid technologies have encouraged many businesses and researchers to establish Virtual Organizations (VOs) and make use of their available desktop resources to solve computing-intensive problems. These VOs, however, work as disjointed and independent communities with no resource sharing between them. In previous work, we proposed a fully decentralized and reconfigurable Inter-Grid framework for resource sharing among such distributed and autonomous Grid systems (Rao et al., ICCSA, 2006). The specific problem underlying such a system of collaborating Grids is the scheduling of resources, as there is very little knowledge about the availability of the resources due to the distributed and autonomous nature of the underlying Grid entities. In this paper, we propose a probabilistic and adaptive scheduling algorithm using system-generated predictions for Inter-Grid resource sharing, keeping collaborating Grid systems autonomous and independent. We first use system-generated job runtime estimates without actually submitting jobs to the target Grid system. This job execution estimate is then used to predict the job scheduling feasibility on the target system. Furthermore, our proposed algorithm adapts itself to the actual resource behavior and performance. Simulation results are presented to discuss the correctness and accuracy of our proposed algorithm.
Eui-Nam Huh (corresponding author)

18.
A resource broker with a user-friendly interface for job submission developed on a platform constructed using the Globus toolkit is proposed. The broker employs a domain-based network information model and dynamic version to measure network statuses, and also monitors and collects resource statuses and network-related information as the basis of its brokerage. A network bandwidth-aware job scheduling algorithm for brokering suitable Grid resources to communication-intensive jobs based on improving and preserving the advantages of our previously developed network information model is also proposed. Using timely information, the resource broker effectively matches Grid resources and user requests, thus improving job execution efficiency.

19.
In this paper, we present a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocating jobs across multiple clusters. We compare our dynamic model with previous research that utilizes a fixed execution time penalty for co-allocated jobs. We explore the interaction of simultaneously co-allocated jobs and the contention they often create in the network infrastructure of a dedicated computational multi-cluster. We also present several bandwidth-aware co-allocating meta-schedulers. These schedulers take inter-cluster network utilization into account as a means by which to mitigate degraded job run-time performance. We make use of a bandwidth-centric parallel job communication model that captures the time-varying utilization of shared inter-cluster network resources. By doing so, we are able to evaluate the performance of multi-cluster scheduling algorithms that focus not only on node resource allocation, but also on shared inter-cluster network bandwidth.

20.
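A dependable grid job scheduling mechanism   Total citations: 1 (self-citations: 1, citations by others: 0)
Tao Yongcai (陶永才), Shi Lei (石磊). Journal of Computer Applications (《计算机应用》), 2010, 30(8): 2066-2069
To address the dynamic nature of grid environments, a dependable grid job scheduling mechanism (DGJS) is proposed. Based on job completion deadlines, DGJS classifies jobs into three levels (high QoS, low QoS, and no QoS), and jobs at different QoS levels have different scheduling priorities. Using resource availability prediction, DGJS adopts a job scheduling strategy based on reliability cost that places jobs on highly reliable resource nodes wherever possible. In addition, DGJS applies different fault-tolerance strategies to jobs at different QoS levels, saving grid resources while still providing fault tolerance. Experiments show that, in a dynamic grid environment, DGJS improves the job success rate and reduces the job completion time compared with traditional grid job scheduling algorithms.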

