Similar literature
 20 similar documents found
1.
Grid resource providers can use gossiping to disseminate their available resource state to remote regions of the grid to attract application load. Pairwise gossiping protocols exchange information about limited subsets of other resources between pairs of potentially remote participants. In epidemic gossiping protocols, the provider disseminates information to multiple neighbors, who in turn forward it to their neighbors, and so on. One important metric for these protocols is their coverage, which characterizes how many and which resources receive the information. Coverage characteristics of epidemic protocols are non-uniform, concentrated within the vicinity of a disseminating node; they can exhibit bi-modal behavior where information either reaches distant nodes or dies out quickly. Pairwise gossiping protocols, on the other hand, provide a more uniform coverage, but it can take longer for the dissemination to reach desired uniformity. In this paper, we study performance characteristics of three gossiping protocols: (1) epidemic gossiping, (2) pairwise gossiping, and (3) adaptive information dissemination (which is based on a form of epidemic gossiping). We report experimental results based on our simulation framework that compare the three protocols in terms of packet overhead and query satisfaction rates. We show that pairwise gossiping protocols work best when resource distribution on the grid is uniform, and that they can be configured to perform well in support of grid scheduling. We also verify this behavior under typical node failures of real-world production grids.
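The difference between the two gossiping styles in this abstract can be illustrated with a minimal sketch. It is not the simulation framework used in the paper: the function names, the `fanout` and `ttl` parameters, and the fully random choice of peers are all assumptions made for illustration. Because peers are drawn uniformly at random, the sketch does not model network locality, so it will not reproduce the coverage that the abstract describes as concentrated near the disseminating node.

```python
import random


def epidemic_coverage(n, fanout, ttl, seed=0):
    """Nodes reached when each newly informed node forwards to `fanout`
    random peers, for at most `ttl` hops (hypothetical parameters)."""
    rng = random.Random(seed)
    informed = {0}          # node 0 is the resource provider
    frontier = [0]
    for _ in range(ttl):
        next_frontier = []
        for _node in frontier:
            for peer in rng.sample(range(n), fanout):
                if peer not in informed:
                    informed.add(peer)
                    next_frontier.append(peer)
        frontier = next_frontier
        if not frontier:    # the dissemination died out
            break
    return len(informed)


def pairwise_coverage(n, rounds, seed=0):
    """Nodes reached when, in every round, each node exchanges state with
    one random partner; after an exchange both sides know the information."""
    rng = random.Random(seed)
    informed = {0}
    for _ in range(rounds):
        for node in range(n):
            partner = rng.randrange(n)
            if node in informed or partner in informed:
                informed.add(node)
                informed.add(partner)
    return len(informed)
```

Calling both functions with the same `n` and increasing `ttl` or `rounds` gives a rough feel for how quickly each style spreads information under these simplified assumptions.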

2.
Resource scheduling in large-scale distributed systems, such as grids and clouds, is difficult due to the size, dynamism, and volatility of resources. These resources are eclectic and autonomous, and may exhibit different usage policies, levels of participation, capabilities, local load, and reliability. Moreover, applications are likely to exhibit various patterns and levels, and distributed resources may organize into various different overlay topologies for information and query dissemination. Researchers have proposed a wide variety of approaches and policies for mapping offered load onto resources and for solving the various component parts of the scheduling problem. However, production clouds and grids may be underutilized, and may not exhibit the load to effectively characterize all of the scheduling system inputs. The composition of large-scale systems is also changing, potentially to include more individual and peer-to-peer resources. These factors will influence the effectiveness of proposed scheduling solutions. Therefore, a simulation environment is necessary to study different approaches under different scenarios, especially those that are expected, but that are not currently characteristic of existing systems. This article describes a general-purpose peer-to-peer simulation environment that allows a wide variety of parameters, protocols, strategies and policies to be varied and studied. To provide a proof of concept, utilization of the simulation environment is presented in a large-scale distributed system problem that includes a core model and related mechanisms. In particular, this article presents a definition and possible peer-to-peer solutions for the large-scale scheduling problem. Moreover, this article describes a general simulation model, some policies that can be varied, an implementation, and some sample results.

3.
While the majority of CPUs now sold contain multiple computing cores, current grid computing systems either ignore the multiplicity of cores, or treat them as distinct, independent machines. The latter approach ignores the resource contention present between cores in a single CPU, while the former approach fails to take advantage of significant computing power. We provide a decentralized resource management framework for exploiting multi-core nodes to run multi-threaded applications in peer-to-peer grids. We present two new load-balancing schemes that explicitly account for the resource sharing and contention of multiple cores, and propose a parameterized performance prediction model that can represent a continuum of resource sharing among cores of a CPU. We use extensive simulation to confirm that our two algorithms match jobs with computing nodes efficiently, and balance load during the lifetime of the computing jobs.

4.
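Agent-Based Grid Interconnection Architecture   Total citations: 1, self-citations: 0, citations by others: 1
Many forums, testbeds, and research projects are currently studying grid technology, but each follows its own approach and uses different technologies, so the resulting grid systems cannot interconnect, intercommunicate, or interoperate. To solve this problem, this paper proposes an agent-based grid interconnection architecture that combines agent technology with grid technology, studies the security mechanism and resource management mechanism for grid interconnection, and presents the design of the interconnection architecture. The proposed architecture resolves the problem of differing inter-domain grid security mechanisms and the problem of inter-domain grid resource sharing, supports inter-domain single sign-on and delegation, and is general, simple, efficient, and distributed.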

5.
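Based on the strengths and weaknesses of static and dynamic load-balancing strategies in distributed systems, a hybrid load-balancing strategy for grid computing environments is proposed. To let nodes in the network efficiently execute complex, computation-intensive tasks in a grid computing environment, a function for evaluating node efficiency is proposed, and simulation experiments confirm the superiority of the algorithm under this function.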

6.
Grid computing is a newly developed technology for complex systems with large-scale resource sharing, wide-area communication, and multi-institutional collaboration. Grid scheduling is an important infrastructure component of the grid computing environment. Most existing grid scheduling methods focus on maximizing processor utilization without taking grid load into consideration. This may lead to significant inefficiencies in performance, such as large job queues and processing delays. In this paper, we propose a multiagent-based scheduling system for computational grids with a new approach. Agent technology is suitable for a computational grid because of the dynamic, heterogeneous, and autonomous nature of the grid. The main idea of the proposed system is to combine static scheduling, using a fixed scheduling algorithm, with dynamic adjustment through the autonomous behavior of agents. Simulation experiments demonstrate the superiority of the proposed system in reducing the load of the grid and minimizing the response time for executing user applications.

7.
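Using a grid effectively requires up-to-date information about widely distributed resources. This poses a challenge because of the scale of the grid and the constantly changing state of its resources. This paper proposes non-uniform information dissemination protocols that efficiently distribute information to designated information repositories without resorting to flooding or centralized approaches. Observations show that grid resources tend to matter more to nearby users. Exploiting this property, the non-uniform dissemination scheme distributes resource information in inverse proportion to the distance from the resource. The results show a significant reduction in cost compared with uniform dissemination.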
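The idea of disseminating resource information less often to more distant repositories can be sketched as follows. The abstract does not state the actual rule, so the rule used here (a repository `distance` hops away receives every `distance`-th update) and both function names are assumptions chosen only to make the cost comparison with uniform dissemination concrete.

```python
def receives_update(update_seq, distance):
    """Hypothetical rule: a repository `distance` hops away receives
    every `distance`-th update, so its update rate is 1 / distance."""
    if distance < 1:
        raise ValueError("distance must be at least 1 hop")
    return update_seq % distance == 0


def message_count(distances, num_updates, uniform=False):
    """Total messages sent to repositories at the given hop distances.

    With uniform=True every repository receives every update."""
    total = 0
    for seq in range(num_updates):
        for distance in distances:
            if uniform or receives_update(seq, distance):
                total += 1
    return total
```

For repositories at 1, 2, and 4 hops and 8 updates, uniform dissemination sends 24 messages while this non-uniform rule sends 14, since the farther repositories see only a fraction of the updates.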

8.
Algorithmic mechanism design for load balancing in distributed systems   Total citations: 6, self-citations: 0, citations by others: 6
Computational grids are promising next-generation computing platforms for large-scale problems in science and engineering. Grids are large-scale computing systems composed of geographically distributed resources (computers, storage, etc.) owned by self-interested agents or organizations. These agents may manipulate the resource allocation algorithm for their own benefit, and their selfish behavior may lead to severe performance degradation and poor efficiency. In this paper, we investigate the problem of designing protocols for resource allocation involving selfish agents. Solving this kind of problem is the object of mechanism design theory. Using this theory, we design a truthful mechanism for solving the static load balancing problem in heterogeneous distributed systems. We prove that, using the optimal allocation algorithm, the output function admits a truthful payment scheme satisfying voluntary participation. We derive a protocol that implements our mechanism and present experiments to show its effectiveness.

9.
Effective load distribution is of great importance in grids, which are complex heterogeneous distributed systems. In this paper we study site allocation scheduling of nonclairvoyant jobs in a 2-level heterogeneous grid architecture. Three scheduling policies at grid level which utilize site load information are examined. The aim is to reduce site load information traffic, while the mean response time of jobs and fairness in utilization between the heterogeneous sites are also of great interest. A simulation model is used to evaluate performance under various conditions. Simulation results show that a considerable reduction in site load information traffic, together with utilization fairness, can be achieved at the expense of a slight increase in response time.

10.
In multicluster systems, and more generally in grids, jobs may require co-allocation, that is, the simultaneous or coordinated access of single applications to resources of possibly multiple types in multiple locations managed by different resource managers. Co-allocation presents new challenges to resource management in grids, such as locating sufficient resources in geographically distributed sites, allocating and managing resources in multiple, possibly heterogeneous sites for single applications, and coordinating the execution of single jobs at multiple sites. Moreover, as single jobs now may have to rely on multiple resource managers, co-allocation introduces reliability problems. In this paper, we present the design and implementation of a co-allocating grid scheduler named KOALA that meets these co-allocation challenges. In addition, we report on the results of an analysis of the performance in our multicluster testbed of the co-allocation policies built into KOALA. We also include the results of a performance and reliability test of KOALA while our testbed was unstable. Copyright © 2007 John Wiley & Sons, Ltd.

11.
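Grid computing is a new-generation computing platform proposed for solving large-scale, resource-intensive problems and is one direction in the development of parallel and distributed processing; resource management is one of the key technologies of computational grids. Integrating and managing the many kinds of available resources is the foundation of grid applications, yet the distribution, dynamism, heterogeneity, and autonomy of resources, together with the need for coordination, make grid resource management and scheduling a thorny problem. Current market-based economic resource management and scheduling algorithms are well suited to resource management in computational grids, but they have problems such as scheduling prices that cannot be changed and load balancing. This paper proposes an "economic-model-based resource broker for the grid environment", which relies on a scheduling strategy guided by multidimensional QoS and on heuristic, economic-model-based adjustment of resource prices to improve and optimize resource allocation in computational grids.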

12.
Resource management is a central component of a grid system. Analysis of the LCG workload log file, including the daily cycles of job arrival and resource utilization, shows that idle sites in the grid are a source of load imbalance and energy waste. Here we focus on two issues: balancing the workload by transferring jobs to idle sites at prime time, to minimize response time and maximize resource utilization; and power management by switching idle sites to sleep mode at non-prime time, to minimize energy consumption. We formulate an M/G/1 queue model with server vacations, startup, and closedown to analyze the performance metrics and guide the design of load-balancing and energy-saving policies. We present our Adaptive Receiver Initiated (ARI) load-balancing strategy and a power-management policy for energy saving. The simulation experiments confirm the accuracy of our analysis, and the comparison results indicate that our policies are well suited to large-scale heterogeneous grid environments.
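As background for the queueing model named in this abstract, the sketch below computes the mean waiting time in a plain M/G/1 queue with the Pollaczek-Khinchine formula and adds the standard extra term for multiple server vacations. It is a simplification and not the paper's model: startup and closedown times are omitted, and the function and parameter names are assumptions.

```python
def mg1_mean_wait(arrival_rate, mean_service, second_moment_service,
                  mean_vacation=0.0, second_moment_vacation=0.0):
    """Mean time a job waits in queue (excluding its own service).

    Plain M/G/1:        W_q = lambda * E[S^2] / (2 * (1 - rho))
    Multiple vacations: add E[V^2] / (2 * E[V])
    Startup and closedown periods are NOT modeled in this sketch.
    """
    rho = arrival_rate * mean_service  # server utilization
    if not 0 <= rho < 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    wait = arrival_rate * second_moment_service / (2 * (1 - rho))
    if mean_vacation > 0:
        wait += second_moment_vacation / (2 * mean_vacation)
    return wait
```

With exponential service of mean 1 (second moment 2) and arrival rate 0.5, the formula gives a mean wait of 1, which agrees with the M/M/1 result; vacations lengthen the wait because an arriving job may find the server away.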

13.
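In recent years, as scientific research has demanded ever more computing resources, grid computing, which combines distributed computing environments with the Internet, has attracted the attention of a growing number of researchers. Grid computing uses idle computing resources in the network to help execute complex tasks that require large amounts of computation. Based on the strengths and weaknesses of static and dynamic load-balancing strategies in distributed systems, this paper proposes a hybrid load-balancing strategy for grid computing environments. To let nodes in the network efficiently execute complex, computation-intensive tasks in a grid computing environment, and drawing on a large body of experimental results, a new function for evaluating node efficiency is proposed, which improves execution efficiency over previous functions.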

14.
In this paper, we examine three general classes of space-sharing scheduling policies under a workload representative of large-scale scientific computing. These policies differ in the way processors are partitioned among the jobs as well as in the way jobs are prioritized for execution on the partitions. We consider new static, adaptive and dynamic policies that differ from previously proposed policies by exploiting user-supplied information about the resource requirements of submitted jobs. We examine the performance characteristics of these policies from both the system and user perspectives. Our results demonstrate that existing static schemes do not perform well under varying workloads, and that the system scheduling policy for such workloads must distinguish between jobs with large differences in execution times. We show that obtaining good performance under adaptive policies requires some a priori knowledge of the job mix in these systems. We further show that a judiciously parameterized dynamic space-sharing policy can outperform adaptive policies from both the system and user perspectives.

15.
Besides the dynamic nature of grids, which means that resources may enter and leave the grid at any time, in many cases outside of the applications’ control, grid resources are also heterogeneous in nature. Many grid applications will be running in environments where interaction faults are more likely to occur between disparate grid nodes. As resources may also be used outside of organizational boundaries, it becomes increasingly difficult to guarantee that a resource being used is not malicious. Due to the diverse faults and failure conditions, developing, deploying, and executing long-running applications over the grid remains a challenge. Fault tolerance is therefore an essential factor for grid computing. This paper presents an extensive survey of different fault-tolerant techniques such as replication strategies, check-pointing mechanisms, scheduling policies, failure detection mechanisms and, finally, malleability and migration support for divide-and-conquer applications. These techniques are used according to the needs of the computational grid and the type of environment, resources, virtual organizations and job profile it is supposed to work with. Each has its own merits and demerits, which form the subject matter of this survey.

16.
汤小春  赵全  符莹  朱紫钰  丁朝  胡小雪  李战怀 《软件学报》2022,33(12):4704-4726
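The adoption of the Dataflow model has unified batch processing and stream processing in big data computing. However, existing cluster resource scheduling frameworks for big data computing target either stream processing or batch processing and are ill-suited to batch and stream jobs sharing cluster resources. In addition, when GPUs are used for big data analytics, the lack of an effective way to decouple CPU and GPU resources lowers resource utilization. Building on an analysis of existing cluster resource scheduling frameworks, this work designs and implements HRM, a hybrid resource scheduling framework that is aware of batch and stream applications. Based on a shared-state architecture, it combines optimistic and pessimistic locking protocols to satisfy the different resource requirements of stream jobs and batch jobs. On compute nodes, it provides flexible binding of CPU-GPU resources and uses a queue stacking technique, which not only meets the real-time requirements of stream jobs but also reduces feedback latency and enables GPU sharing. Simulated scheduling of large-scale jobs shows that the scheduling latency of HRM is only about 75% of that of a centralized scheduling framework. In tests with real workloads in which batch and stream jobs share the cluster, HRM improves CPU utilization by more than 25%; with fine-grained job scheduling, GPU utilization improves by more than a factor of two and job completion time falls by about 50%.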

17.
Scheduling large-scale applications in heterogeneous grid systems is a fundamental NP-complete problem that is critical to obtaining good performance and execution cost. Achieving high performance in a grid system requires effective task partitioning, resource management and load balancing. The heterogeneous and dynamic nature of a grid, as well as the diverse demands of applications running on the grid, makes grid scheduling a major task. Existing schedulers in wide-area heterogeneous systems require a large amount of information about the application and the grid environment to produce reasonable schedules. However, this required information may not be available, may be too expensive to collect, or may increase the runtime overhead of the scheduler such that the scheduler is rendered ineffective. We believe that no one scheduler is appropriate for all grid systems and applications. This is because data-parallel applications, in which further data partitioning is possible, can be improved by efficient resource management, smart resource selection and load balancing, whereas in functional (non-divisible-task) parallel applications such partitioning is either impossible, difficult, or expensive in terms of performance. In this paper, we propose a scheduler for data parallel applications (SDPA) which offers an efficient task partitioning and load balancing strategy for data-parallel applications in a grid environment. The proposed SDPA offers two major features: maintaining job priority even if an insufficient number of free resources is available, and pre-task assignment to cut the idle time of nodes. The SDPA selects nodes smartly according to the nature of the task and the nodes’ resource availability. Simulation results reveal that SDPA achieves a performance improvement over the strategies reported in the reviewed literature in terms of execution time, throughput and waiting time.

18.
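Grid Engine is a tool for building local and cluster grids; its framework consists of four types of hosts and their corresponding daemons. This paper studies how to build a distributed simulation grid platform with the SGE framework and describes the workflow for executing user-submitted simulation tasks on the platform. It then discusses resource organization and job scheduling in the SGE-based simulation grid and analyzes the job scheduling algorithms used there, including the FIFO, priority, equal-share, and calendar algorithms for determining job order, and the load-adjustment and queue sequence number algorithms for determining queue order.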
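Two of the job-ordering rules named in this abstract (FIFO and priority) and load-based queue ordering can be sketched generically as below. This is not SGE code or configuration: the `Job` and `Queue` classes, their fields, and the tie-breaking choices are hypothetical and serve only to show what each ordering rule means.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    submit_time: int   # hypothetical: smaller means submitted earlier
    priority: int = 0  # hypothetical: larger means more important


@dataclass
class Queue:
    name: str
    load: float        # hypothetical load figure reported for the queue's host
    seq_no: int = 0    # hypothetical administrator-assigned sequence number


def fifo_order(jobs):
    """First in, first out: earliest submission runs first."""
    return sorted(jobs, key=lambda job: job.submit_time)


def priority_order(jobs):
    """Highest priority first; ties fall back to submission order."""
    return sorted(jobs, key=lambda job: (-job.priority, job.submit_time))


def queue_order_by_load(queues):
    """Least-loaded queue first; ties fall back to the sequence number."""
    return sorted(queues, key=lambda queue: (queue.load, queue.seq_no))
```

A scheduler built on these helpers would take the first job from the chosen job ordering and place it on the first queue from the queue ordering that can accept it.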

19.
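Research on Grid Resource Management Based on Computational Economy   Total citations: 2, self-citations: 2, citations by others: 0
The grid is a new-generation computing platform proposed for solving large-scale, resource-intensive problems, and resource management is one of its key technologies. However, the distribution, heterogeneity, autonomy, and dynamism of resources make grid resource management extremely complex. Market-based economic resource management and scheduling algorithms are currently well suited to solving resource management problems in the grid. This paper proposes various agents based on an economic model for the grid environment, presents a new resource management model, defines a utility function, and gives a resource scheduling algorithm based on utility optimization, offering an effective approach to the grid resource management problem.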

20.
Dandan  Xiaohua  Ivan   《Computer Communications》2007,30(18):3627-3643
A location service provides the position of a mobile destination to the source node to enable geo-routing. In existing quorum-based location service protocols, the destination node registers its location along a ‘column’ while the source node makes a query along a ‘row’. Grid and quorum-based location service is based on dividing the network into square grids and selecting a ‘leader’ location server node in each grid. Location updates, leader reelection and information transfer are performed whenever destination and leader nodes move to a different grid. We propose here to apply connected dominating sets (DS) as an alternative to grids. We also improve the basic quorum and apply to DS-quorum (DS-based quorum) a better criterion for triggering local information exchanges and global location updates, which requires two conditions to be met: movement over a certain distance and a certain number of observed link changes with DS nodes. Backbones created by DS nodes (using 1-hop neighborhood information) are small, do not have a parameter like grid size, and preserve network connectivity without the help of other nodes. Location updates and destination searches are restricted to backbone nodes. Both methods use ‘hello’ messages to learn neighbors. While this suffices to construct DS, grid leader (re)election requires additional messages. Simulation results show that using DS as the backbone for quorum construction is superior to using a grid as the backbone or no backbone at all. The proposed DS-quorum location service can achieve a higher (or similar) success rate with much less communication overhead than grid-based approaches.
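The row and column quorum idea in this abstract works because any column and any row of a grid share exactly one cell. The sketch below shows that intersection on an idealized square grid of cells. The cell coordinates and function names are assumptions; real protocols operate on geographic routes between mobile nodes, and the DS-quorum variant proposed in the paper replaces the grid with a dominating-set backbone, which this sketch does not model.

```python
def column_quorum(cell, size):
    """Cells where a destination in `cell` registers its location:
    every cell in the same column of a size x size grid."""
    _row, col = cell
    return {(r, col) for r in range(size)}


def row_quorum(cell, size):
    """Cells that a source in `cell` queries: every cell in its row."""
    row, _col = cell
    return {(row, c) for c in range(size)}


def rendezvous_cells(dest_cell, src_cell, size):
    """Cells holding the destination's location that the query also visits."""
    return column_quorum(dest_cell, size) & row_quorum(src_cell, size)
```

For a destination in cell (1, 3) and a source in cell (4, 0) on a 5 x 5 grid, the query along row 4 meets the registration along column 3 in cell (4, 3).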
