Similar literature
 20 similar documents found (search time: 265 ms)
1.
Monitoring of the system performance in highly distributed computing environments is a wide research area. In cloud and grid computing, it is usually restricted to the utilization and reliability of the resources. However, in today’s Computational Grids (CGs) and Clouds (CCs), the end users may define special personal requirements and preferences in resource and service selection, service functionality and data access. Such requirements may refer to special individual security conditions for the protection of the data and application codes. Therefore, solving the scheduling problems in modern distributed environments remains challenging for most of the well-known schedulers, and the general functionality of the monitoring systems must be improved to make them efficient as scheduler-supporting modules. In this paper, we define a novel model of security-driven grid schedulers supported by an Artificial Neural Network (ANN). The ANN module monitors the schedule executions and learns about secure task–machine mappings from the observed machine failures. Then, the metaheuristic grid schedulers (in our case, genetic-based schedulers) are supported by the ANN module through the integration of the sub-optimal schedules generated by the neural network with the genetic populations of the schedules. The influence of the ANN support on the general schedulers’ performance is examined in experiments conducted for four types of grid networks (small, medium, large and very large grids), two security scheduling modes (risky and secure scenarios), and six genetic-based grid schedulers. The generated empirical results show the high effectiveness of such monitoring support in reducing the values of the major scheduling criteria (makespan and flowtime), the run times of the schedulers and the grid resource failures.
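
The two scheduling criteria named in this abstract, makespan and flowtime, can be illustrated with a minimal sketch. It assumes an expected-time-to-compute matrix `etc[task][machine]` and a fixed task-to-machine assignment in which each machine runs its tasks one after another in index order. The function name, the matrix, and the ordering rule are illustrative assumptions, not details taken from the paper.

```python
def makespan_and_flowtime(etc, assignment):
    """Return (makespan, flowtime) for a fixed task-to-machine assignment.

    etc[t][m]     -- assumed expected time to compute task t on machine m
    assignment[t] -- index of the machine that runs task t
    Each machine is assumed to run its tasks sequentially in task-index order.
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines      # time at which each machine becomes free
    flowtime = 0.0                  # sum of the completion times of all tasks
    for task, machine in enumerate(assignment):
        ready[machine] += etc[task][machine]
        flowtime += ready[machine]
    return max(ready), flowtime     # makespan = latest machine finishing time


etc = [[2, 4], [3, 1], [5, 2]]      # 3 tasks, 2 machines (made-up numbers)
print(makespan_and_flowtime(etc, [0, 1, 0]))
```

A scheduler that minimizes both values is trading off the finishing time of the busiest machine against the average wait of individual tasks.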

2.
Computational Grids (CGs) have become an appealing research area. They offer a suitable environment for developing large-scale parallel applications. CGs integrate a huge amount of distributed heterogeneous resources to constitute a powerful virtual supercomputer. Scheduling is the most important issue for enhancing the performance of CGs. Various strategies have been introduced, including static and dynamic behaviors. The former maps tasks to resources at submission time, while the latter operates at run time. While static scheduling is unsuitable for the dynamic Grid environment, scheduling in CGs is still more complex than the proposed dynamic solutions. This paper introduces a decentralized Adaptive Grid Scheduler (AGS) based on a novel rescheduling mechanism. AGS has several salient properties: it is hybrid, adaptive, decentralized, and efficient. AGS is also a robust mechanism, as it has the ability to (i) detect resource failures, (ii) continue functioning in spite of the failure, and then (iii) recover. Moreover, it integrates both static and dynamic scheduling behaviors. An initial static scheduling map is proposed for an input Directed Acyclic Graph (DAG). However, DAG tasks may be rescheduled if the performance of the allocated resources changes in a way that may affect the tasks’ response time. AGS overcomes drawbacks of traditional schedulers by utilizing the unique features of mobile agents to enhance the resource discovery and monitoring processes. Experimental results have shown that AGS outperforms traditional Grid schedulers as it introduces a better scheduling efficiency.

3.
Most current research in Grid computing is still focused on improving the performance of Grid schedulers. However, unlike traditional scheduling, in Grid systems there are other important requirements to be taken into account. One such requirement is secure scheduling, namely achieving an efficient allocation of tasks to reasonably trustworthy resources. In this paper we formalize the Grid scheduling problem as a non-cooperative non-zero sum game of the Grid users in order to address the security requirements. The premise of this model is that in a large-scale Grid, cooperation among all users in the system is unlikely to happen. The users’ cost of playing the game is interpreted as a total cost of the secure job execution in the Grid. The game cost function is minimized, at global (Grid) and local (users) levels, by using four genetic-based hybrid meta-heuristics. We have evaluated the proposed model under heterogeneous, large-scale and dynamic conditions using a Grid simulator. The relative performance of the four hybrid schedulers is measured by the makespan and flowtime metrics. The obtained results suggest that it is more resilient for the Grid users to pay some additional scheduling cost, due to verification of the security conditions, instead of taking the risk of assigning their tasks to unreliable resources.

4.
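1. Introduction. With the development of computer network technology, distributed real-time applications are appearing in more and more fields; typical commercial products include distributed virtual reality systems, distributed multimedia collaboration systems, and multiplayer online games. These systems often have complex and strict QoS requirements, such as latency, jitter, and reliability requirements. To implement these systems well, the key problem is to achieve flexible communication among competitors that originate on different end systems (such as distributed threads and operations on CORBA objects) and to properly maintain their end…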

5.
Entropic Grid Scheduling   Cited by: 1 (self-citations: 0; citations by others: 1)
Computational Grids (CGs) are large-scale dynamical networks of geographically distributed peer resource clusters. These clusters are independent but cooperating computing systems bound by a management framework for the provision of computing services, called Grid Services. In its basic form, the Grid scheduling problem consists in finding at least one cluster that has the capacity to handle, within the constraints of a specified quality of service, a user service request submitted to the CG. Since CGs span distinct management domains, the scheduling process has to be decentralized. Furthermore, it has to account for the ubiquitous uncertainty about the state of the CG. In this paper, we propose a scalable distributed entropy-based scheduling approach that utilizes a Markov chain model to capture the dynamics of the service capacity state. An entropy-based quantification of the uncertainty in the service capacity information is developed and explicitly integrated within the proposed Grid scheduling approach. The performance of the proposed scheduling strategy is validated, through simulation, against a random delegation scheme and a load balancing-based scheduling strategy with respect to throughput, exploitation and convergence speed.
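
One way to picture an entropy-based measure of uncertainty over a Markov-modelled capacity state is the sketch below. It assumes a discrete set of capacity states, a known row-stochastic transition matrix, and Shannon entropy of the predicted state distribution as the uncertainty measure. The matrix values and function names are invented for illustration; the paper's actual model may differ.

```python
import math


def step(dist, transition):
    """Advance a state distribution by one step of the Markov chain.

    transition[i][j] is the probability of moving from state i to state j.
    """
    n = len(transition)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def entropy(dist):
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


# Hypothetical two-state chain: state 0 = "has capacity", state 1 = "saturated".
P = [[0.9, 0.1],
     [0.5, 0.5]]

dist = [1.0, 0.0]          # capacity was just observed, so there is no uncertainty
for age in range(4):
    print(age, round(entropy(dist), 3))
    dist = step(dist, P)   # the observation gets one step older
```

In this toy chain the entropy rises as the last observation ages, which is the kind of signal a scheduler could use to discount stale capacity reports.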

6.
Computational grids have become an appealing research area as they solve compute-intensive problems within the scientific community and in industry. A Grid's computational power is aggregated from a huge set of distributed heterogeneous workers; hence, it is becoming a mainstream technology for large-scale distributed resource sharing and system integration. Unfortunately, current grid schedulers suffer from the haste problem, which is the scheduler's inability to successfully allocate all input tasks. Accordingly, some tasks fail to complete execution as they are allocated to unsuitable workers. Others may not start execution as suitable workers have already been allocated to other peers. This paper is the first to introduce the scheduling haste problem. It also presents a reliable grid scheduler. The proposed scheduler selects the most suitable worker to execute an input grid task using a fuzzy inference system. Hence, it minimizes the turnaround time for a set of grid tasks. Moreover, our scheduler is a system-oriented one as it avoids the scheduling haste problem. Experimental results have shown that the proposed scheduler outperforms traditional grid schedulers as it introduces a better scheduling efficiency.

7.
Grid resource management systems and schedulers are important components for building Grids. They are responsible for the selection and allocation of Grid resources to current and future applications. Thus, they are important building blocks for making Grids available to user communities. In this paper we briefly analyze the requirements of Grid resource management and provide a classification of schedulers. Then, we define an extensible formal model for Grid scheduling activities, and characterize the general Grid scheduling problem. Finally, we provide a reference architecture for the support of our model and discuss different aspects of architectural implementations.

8.
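Grid environments are heterogeneous, dynamic, and unreliable. To use resources reasonably and economically, this paper proposes a QoS-based, fault-tolerant task scheduling algorithm that takes the time and cost budgets, together with the weight ratio between time and cost, as QoS parameters. Computation and communication are overlapped to hide network latency. A stochastic Petri net model is used to describe task scheduling in the grid environment, and the reachability graph of the stochastic Petri net is defined to analyze the performance of the task scheduling model. Analysis and simulation show that the algorithm satisfies the user's time and cost constraints, is fault tolerant, and achieves short task completion times and low overall cost.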

9.
Grid resources are typically diverse in nature with respect to their software and hardware configurations, resource usage policies and the kind of application they support. Aggregating and monitoring these resources, and discovering suitable resources for the applications, becomes a challenging issue. This is partially due to the representation of Grid metadata supported by the existing Grid middleware, which offers limited scope for matching the job requirements that directly affect scheduling decisions. This paper proposes a semantic component in conventional Grid architecture to support ontology-based representation of Grid metadata and facilitate context-based information retrieval that complements Grid schedulers for effective resource management. The Web Ontology Language is used for creating the Grid resource ontology, and the Algernon inference engine is used for resource discovery. This semantic component has been integrated with conventional Grid schedulers. Several experiments have also been carried out to investigate the performance overhead that arises while integrating this component with Grid schedulers. Copyright © 2009 John Wiley & Sons, Ltd.

10.
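To address reliable computing in dynamic, unstable grid environments, a reliable grid computing model based on redundant scheduling is proposed. The reliability of a computational grid system is first defined; based on this definition, a reliable grid computing model with redundant scheduling is given and a redundant scheduling algorithm is designed. Simulation results show that the proposed model improves the reliability of task scheduling in computational grids. To make the model better suited to real grid computing environments, a probability-based formula for optimizing the redundancy degree is given. Introducing this formula into the redundant scheduling model yields an optimized scheduling redundancy degree, which improves not only the reliability of the task scheduling system but also resource utilization.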
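
The abstract does not reproduce its redundancy formula, so the following is only a sketch of the general idea under a simplifying assumption: replicas of a task fail independently and each succeeds with the same probability `p_success`. Under that assumption a task with k replicas succeeds with probability 1 - (1 - p_success)^k, and the smallest k that reaches a target reliability avoids wasting resources on extra copies. The function name and parameters are hypothetical.

```python
def min_redundancy(p_success, target):
    """Smallest number of replicas k with 1 - (1 - p_success)**k >= target.

    Assumes independent replica failures with identical success probability;
    this is an illustrative model, not the formula from the paper.
    """
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    k = 1
    # Add replicas until the probability that at least one succeeds is high enough.
    while 1.0 - (1.0 - p_success) ** k < target:
        k += 1
    return k


print(min_redundancy(0.8, 0.99))   # replicas needed on nodes that succeed 80% of the time
```

Choosing k per task from the measured success probability, instead of a fixed replication factor, is what lets reliability and utilization improve together.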

11.
The frequent and volatile unavailability of volunteer-based Grid computing resources challenges Grid schedulers to make effective job placements. The manner in which host resources become unavailable will have different effects on different jobs, depending on their runtime and their ability to be checkpointed or replicated. A multi-state availability model can help improve scheduling performance by capturing the various ways a resource may be available or unavailable to the Grid. This paper uses a multi-state model and analyzes a machine availability trace in terms of that model. Several prediction techniques then forecast resource transitions into the model’s states. We analyze the accuracy of our predictors, which outperform existing approaches. We also propose and study several classes of schedulers that utilize the predictions, and a method for combining scheduling factors. We characterize the inherent tradeoff between job makespan and the number of evictions due to failure, and demonstrate how our schedulers can navigate this tradeoff under various scenarios. Lastly, we propose job replication techniques, which our schedulers utilize to replicate those jobs that are most likely to fail. Our replication strategies outperform others, as measured by improved makespan and fewer redundant operations. In particular, we define a new metric for replication efficiency, and demonstrate that our multi-state availability predictor can provide information that allows our schedulers to be more efficient than others that blindly replicate all jobs or some static percentage of jobs.

12.
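A dynamic task scheduling method is proposed for a deep learning research and development platform on GPU clusters that is aware of users' quality of service (QoS). An offline evaluation module profiles deep learning tasks offline and builds a computing performance prediction model. An online scheduling module uses the performance prediction model, together with each task's expected QoS, to jointly schedule task placement and task execution order. Experiments on a distributed GPU cluster instance show that the method achieves a higher QoS guarantee rate and higher cluster resource utilization than other baseline strategies.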

13.
Job scheduling is one of the key issues in the design of grid environments. The performance of the grid system severely degrades if a method does not exist to efficiently schedule the user jobs. In this article, a fully distributed, learning automata–based job scheduling algorithm is proposed for grid environments. The proposed method is composed of two types of procedures: in the first, a procedure is run at the grid nodes and in the second, the procedure is run at the schedulers. The proposed algorithm synchronizes the performance of the schedulers by the learning automata that select their actions using the pseudo-random number generators with the same seed. In this method, the grid computational capacity that is allocated to each scheduler is proportional to its workload. To show the efficiency of the proposed method, several simulation experiments were conducted under different grid scenarios. The obtained results show that the proposed algorithm outperforms several well-known methods in terms of makespan, flow time, and load balancing.
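
The idea of schedulers staying in step through identically seeded pseudo-random generators can be sketched as follows. The class below is a generic learning automaton with a linear reward-inaction update, which is an assumption chosen for illustration; the abstract does not state which update scheme the article uses, and the class name, learning rate, and seed are hypothetical.

```python
import random


class LearningAutomaton:
    """Generic learning automaton with a linear reward-inaction update (assumed)."""

    def __init__(self, n_actions, seed, rate=0.1):
        self.probs = [1.0 / n_actions] * n_actions   # start with uniform action probabilities
        self.rng = random.Random(seed)               # private generator, seeded explicitly
        self.rate = rate

    def choose(self):
        """Sample an action index according to the current probabilities."""
        return self.rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def reward(self, action):
        """Shift probability mass toward a rewarded action; the total stays 1."""
        self.probs = [
            p + self.rate * (1.0 - p) if i == action else p * (1.0 - self.rate)
            for i, p in enumerate(self.probs)
        ]


# Two schedulers with the same seed and the same feedback pick the same actions
# without exchanging any messages.
a = LearningAutomaton(3, seed=42)
b = LearningAutomaton(3, seed=42)
for _ in range(5):
    x, y = a.choose(), b.choose()
    a.reward(x)
    b.reward(y)
    print(x, y)
```

The synchronization only holds while both automata see the same reward sequence; divergent feedback would make their probability vectors, and so their choices, drift apart.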

14.
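Grid service management is a core problem of grid computing. Three models of current grid service management architectures are analyzed and compared, and the functional requirements of a grid service management system are discussed on the basis of the Open Grid Services Architecture (OGSA). A hierarchical grid service management model, HGSM, is then designed and its workflow described. Grid service management is divided into three levels: task decomposition, static scheduling, and dynamic scheduling. The functional modules at each level of HGSM are discussed, and algorithms for task decomposition and service scheduling are proposed using directed acyclic graphs and high-level stochastic Petri nets, respectively. The enabling predicates, random switches, and firing rates described in the algorithms can be implemented directly when programming SPN solver software, which provides a practical and effective way to build a hierarchical grid service management model.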

15.
QoS guided Min-Min heuristic for grid task scheduling   Cited by: 75 (self-citations: 1; citations by others: 74)
Task scheduling is an integrated component of computing. With the emergence of Grid and ubiquitous computing, new challenges appear in task scheduling based on properties such as security, quality of service, and lack of central control within distributed administrative domains. A Grid task scheduling framework must be able to deal with these issues. One of the goals of Grid task scheduling is to achieve high system throughput while matching applications with the available computing resources. This matching of resources in a non-deterministically shared heterogeneous environment leads to concerns over Quality of Service (QoS). In this paper a novel QoS guided task scheduling algorithm for Grid computing is introduced. The proposed algorithm is based on a general adaptive scheduling heuristic that includes QoS guidance. The algorithm is evaluated within a simulated Grid environment. The experimental results show that the new QoS guided Min-Min heuristic can lead to significant performance gain for a variety of applications. The approach is compared with others based on the quality of the prediction formulated by inaccurate information.
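
The abstract does not spell out the algorithm, so the sketch below shows one plausible reading of a QoS-guided Min-Min: tasks that require high QoS are placed first with Min-Min restricted to machines able to provide it, and the remaining tasks are then placed with Min-Min over all machines. The `etc` matrix, the two-level QoS split, and all names are assumptions made for illustration.

```python
def min_min(tasks, machines, etc, ready):
    """Min-Min: repeatedly place the task with the smallest achievable completion time.

    Updates `ready` (per-machine availability times) in place and returns {task: machine}.
    """
    schedule = {}
    pending = set(tasks)
    while pending:
        # Smallest completion time over every pending task and every allowed machine.
        finish, task, machine = min(
            (ready[m] + etc[t][m], t, m) for t in pending for m in machines
        )
        schedule[task] = machine
        ready[machine] = finish
        pending.remove(task)
    return schedule


def qos_guided_min_min(etc, high_qos_tasks, high_qos_machines):
    """Place high-QoS tasks on QoS-capable machines first, then everything else."""
    n_tasks, n_machines = len(etc), len(etc[0])
    ready = [0.0] * n_machines
    high = [t for t in range(n_tasks) if t in high_qos_tasks]
    low = [t for t in range(n_tasks) if t not in high_qos_tasks]
    schedule = min_min(high, sorted(high_qos_machines), etc, ready)
    schedule.update(min_min(low, range(n_machines), etc, ready))
    return schedule, max(ready)     # assignment and resulting makespan


etc = [[4, 2], [3, 6], [5, 1]]      # 3 tasks, 2 machines (made-up numbers)
print(qos_guided_min_min(etc, high_qos_tasks={1}, high_qos_machines={0}))
```

Placing the constrained tasks first keeps low-QoS tasks from occupying the few machines that the high-QoS tasks can use, which is the intuition behind adding QoS guidance to plain Min-Min.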

16.
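In large-scale data analysis environments, there are often tasks with short durations and high degrees of parallelism. How to schedule these concurrent jobs with low-latency requirements is a current research focus. In some existing cluster resource management frameworks, centralized schedulers cannot meet low-latency requirements because of the bottleneck at the master node, while some distributed schedulers achieve low-latency task scheduling but fall short in optimal resource allocation and in handling resource allocation conflicts. Starting from the needs of large-scale real-time jobs, a distributed cluster resource scheduling framework is designed and implemented to meet the low-latency requirements of large-scale data processing. First, a two-stage scheduling framework and an optimized two-stage multi-path scheduling framework are proposed. Then, to address the resource conflicts that arise during two-stage multi-path scheduling, a load-balancing-based task transfer mechanism is proposed, which resolves the load imbalance among compute nodes. Finally, the task scheduling framework for large-scale clusters is simulated and validated using real workloads and a simulated scheduler. For real workloads, the scheduling delay of the proposed framework is within 12% of ideal scheduling; in the simulated environment, the framework reduces the delay of short tasks by more than 40% compared with a centralized scheduler.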

17.
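The execution environment of grid tasks is dynamic and distributed. To complete tasks successfully and efficiently, an effective task scheduling strategy is needed. Drawing on the characteristics of information processing, a fast and effective grid task scheduling algorithm is proposed. The algorithm predicts task execution times from historical information and groups subtasks appropriately according to the tasks' deadline requirements. Finally, test results for the algorithm on a grid simulator are presented and compared with those of several other algorithms. The results show that the algorithm schedules large jobs and jobs with urgent deadlines well.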

18.
Grids facilitate the creation of wide-area collaborative environments for sharing computing or storage resources and various applications. Inter-connecting distributed Grid sites through a peer-to-peer routing and information dissemination structure (also known as Peer-to-Peer Grids) is essential to avoid the problems of scheduling efficiency bottleneck and single point of failure in the centralized or hierarchical scheduling approaches. On the other hand, uncertainty and unreliability are facts in distributed infrastructures such as Peer-to-Peer Grids, which are triggered by multiple factors including scale, dynamism, failures, and incomplete global knowledge. In this paper, a reputation-based Grid workflow scheduling technique is proposed to counter the effect of inherent unreliability and temporal characteristics of computing resources in large-scale, decentralized Peer-to-Peer Grid environments. The proposed approach builds upon structured peer-to-peer indexing and networking techniques to create a scalable wide-area overlay of Grid sites for supporting dependable scheduling of applications. The scheduling algorithm considers reliability of a Grid resource as a statistical property, which is globally computed in the decentralized Grid overlay based on dynamic feedback or reputation scores assigned by individual service consumers mediated via Grid resource brokers. The proposed algorithm dynamically adapts to changing resource conditions and offers significant performance gains as compared to traditional approaches in the event of unsuccessful job execution or resource failure. The results evaluated through an extensive trace-driven simulation show that our scheduling technique can reduce the makespan by up to 50% and successfully isolate the failure-prone resources from the system.

19.
Grid computing is a widely adopted paradigm to federate geographically distributed data centers. Due to their size and complexity, grid systems are often affected by failures that may hinder the correct and timely execution of jobs, thus causing a non-negligible waste of computing resources. Despite the relevance of the problem, state-of-the-art management solutions for grid systems usually neglect the identification and handling of failures at runtime. Among the primary goals to be considered, we claim the need for novel approaches capable of integrating scalably with efficient monitoring solutions and of fitting large and geographically distributed systems, where dynamic and configurable tradeoffs between overhead and targeted granularity are necessary. This paper proposes GAMESH, a Grid Architecture for scalable Monitoring and Enhanced dependable job ScHeduling. GAMESH is conceived as a completely distributed and highly efficient management infrastructure, concentrating on two crucial aspects for large-scale and multi-domain grid environments: (i) the scalable dissemination of monitoring data and (ii) the troubleshooting of job execution failures. GAMESH has been implemented and tested in a real deployment encompassing geographically distributed data centers across Europe. Experimental results show that GAMESH (i) enables the collection of measurements of both computing resources and conditions of task scheduling at geographically sparse sites, while imposing a limited overhead on the entire infrastructure, and (ii) provides a failure-aware scheduler able to improve the overall system performance, even in the presence of failures, by coordinating local job schedulers at multiple domains.

20.
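A distributed scheduling architecture is developed as an extension of Real-time CORBA 1.2. This architecture can obtain global information, so that local schedulers receive more accurate scheduling information. Only the case in which the local schedulers use a deadline-based scheduling policy is discussed here; in that case, such a scheduler increases the likelihood that distributed tasks complete before their deadlines. If the local schedulers use other scheduling algorithms, corresponding mechanisms can be provided within this architecture to supply more accurate scheduling information. The paper mainly describes the design and implementation of this scheduling architecture.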


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号