Similar Literature
20 similar documents found (search time: 31 ms)
1.
Resource management in high-performance computing systems is dominated by coarse-grained cluster job management, which lacks effective means of controlling job resources and cannot accurately capture a job's resource requirements, so some waste of computing resources remains unavoidable. To make more efficient use of the computing resources of high-performance computing systems, this paper proposes and implements an operating-system-level QoS (quality of service) framework that performs fine-grained accounting and control of job resource usage, realizes dynamic resource control and negotiation mechanisms, and refines job launching and scheduling policies. The framework has achieved good results in improving system resource utilization.
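The abstract does not say which OS mechanism its QoS framework uses. As one concrete example of fine-grained, OS-level control of a single job's resource usage, the sketch below caps a job's CPU and memory with Linux cgroup v2; the paths, limits, and function name are illustrative assumptions and this is not the paper's framework.

```python
# Illustration only: capping one job's CPU and memory with Linux cgroup v2.
# Requires root and the cpu/memory controllers enabled; not the paper's framework.
from pathlib import Path

def limit_job(job_id: str, cpu_pct: int, mem_bytes: int, pid: int) -> None:
    cg = Path("/sys/fs/cgroup") / f"hpc-job-{job_id}"
    cg.mkdir(exist_ok=True)
    period = 100_000                                   # cpu.max period in microseconds
    (cg / "cpu.max").write_text(f"{cpu_pct * period // 100} {period}\n")
    (cg / "memory.max").write_text(f"{mem_bytes}\n")
    (cg / "cgroup.procs").write_text(f"{pid}\n")       # move the job's process into the group

# e.g. limit_job("42", cpu_pct=50, mem_bytes=2 * 1024**3, pid=12345)
```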

2.
A Service-Oriented Strategy for High-Performance Grid Computing (total citations: 1; self-citations: 0; cited by others: 1)
The development of grid technology and Web services has given rise to service computing. This paper revisits high-performance computing on traditional computational grids under a service-oriented architecture. First, targeting the characteristics of high-performance computing applications and combining them with service-oriented ideas, a hierarchical resource management architecture is proposed. Second, the program structure of high-performance computing applications suited to grid environments is analyzed and represented as a directed acyclic graph (DAG). Third, based on this resource management architecture and application model, an improved dynamic-priority scheduling algorithm is proposed. Finally, simulation experiments analyze the performance of the proposed algorithm; the results show that it is well suited to grid environments, which in turn validates the effectiveness of the service-oriented grid high-performance computing strategy proposed in this paper.
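The abstract represents applications as DAGs and schedules them with a dynamic priority, but does not give the priority formula. The sketch below uses an assumed upward-rank-style priority (as in HEFT-like schedulers) over a made-up task graph, purely to illustrate the idea.

```python
# Illustrative only: an assumed upward-rank-style dynamic priority over a task DAG.
# The paper's actual priority function is not specified in the abstract.
from functools import lru_cache

# DAG as adjacency list: task -> list of successor tasks (hypothetical example graph)
succ = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"], "t4": []}
cost = {"t1": 4.0, "t2": 3.0, "t3": 2.0, "t4": 5.0}   # estimated compute cost per task

@lru_cache(maxsize=None)
def rank(task: str) -> float:
    """Upward rank: task cost plus the longest remaining path to the DAG exit."""
    return cost[task] + max((rank(s) for s in succ[task]), default=0.0)

# Schedule ready tasks in decreasing rank order (higher rank = more critical).
ready = ["t2", "t3"]
print(sorted(ready, key=rank, reverse=True))   # ['t2', 't3']
```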

3.
Clusters of computers have emerged as mainstream parallel and distributed platforms for high-performance, high-throughput and high-availability computing. To enable effective resource management on clusters, numerous cluster management systems and schedulers have been designed. However, their focus has essentially been on maximizing CPU performance rather than on improving the value of utility delivered to the user and the quality of service. This paper presents a new computational-economy-driven scheduling system called Libra, designed to support allocation of resources based on users' quality-of-service requirements. It is intended to work as an add-on to an existing queuing and resource management system. The first version has been implemented as a plugin scheduler for the Portable Batch System. The scheduler offers a market-based, economy-driven service for managing batch jobs on clusters by scheduling CPU time according to user-perceived value (utility), determined by each user's budget and deadline rather than by system performance considerations. The Libra scheduler has been simulated using the GridSim toolkit to carry out a detailed performance analysis. Results show that the deadline- and budget-based proportional resource allocation strategy improves the utility of the system and user satisfaction compared with system-centric scheduling strategies. Copyright © 2004 John Wiley & Sons, Ltd.
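The deadline-driven proportional-share idea above can be sketched as follows: a job asks for the CPU share it needs to finish by its deadline, and is admitted only if that share is still available. The admission rule and variable names are simplified assumptions, not the exact Libra implementation.

```python
# Sketch of deadline-driven proportional CPU allocation (simplified; not the exact Libra code).
# Each job asks for the CPU share it needs to finish by its deadline: share = est_runtime / deadline.

def required_share(est_runtime: float, deadline: float) -> float:
    """Fraction of one CPU needed so the job completes within its deadline."""
    return est_runtime / deadline

def admit(node_shares: list[float], est_runtime: float, deadline: float) -> bool:
    """Admit the job only if the node still has enough unallocated CPU share."""
    need = required_share(est_runtime, deadline)
    return sum(node_shares) + need <= 1.0

shares = [0.3, 0.25]                                     # shares already promised to running jobs
print(admit(shares, est_runtime=10.0, deadline=40.0))    # needs 0.25 -> True
print(admit(shares, est_runtime=30.0, deadline=40.0))    # needs 0.75 -> False
```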

4.
In an Internet-oriented computing resource sharing platform, evenly scheduling server-side subtasks to volunteer machines across a large-scale Internet environment is an important research problem. This paper describes an adaptive parallel scheduling model for such a platform. A scheduler sits between the server and the volunteer machines to relieve the server-side access bottleneck: the server first dispatches subtasks according to each scheduler's load, and each scheduler then redistributes its subtasks according to the hardware and software information of its volunteer machines. By running typical benchmark parallel programs and comparing this scheduling policy with other strategies, the model is shown to achieve good performance for coarse-grained, master-slave style parallel applications.
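The two-level dispatch described above (server to schedulers by scheduler load, then scheduler to volunteer machines by the volunteers' state) can be sketched as below; the "least loaded" selection rule and all names are illustrative assumptions rather than the paper's model.

```python
# Two-level dispatch sketch: the server spreads subtasks across schedulers by load,
# and each scheduler forwards them to its least-loaded volunteer machine.
# All names and the "least loaded" rule are illustrative assumptions.

schedulers = {"sched-1": {"load": 2, "volunteers": {"v1": 1, "v2": 0}},
              "sched-2": {"load": 5, "volunteers": {"v3": 3}}}

def dispatch(subtask: str) -> tuple[str, str]:
    # Level 1: pick the least-loaded scheduler.
    s = min(schedulers, key=lambda k: schedulers[k]["load"])
    schedulers[s]["load"] += 1
    # Level 2: pick the least-loaded volunteer under that scheduler.
    vols = schedulers[s]["volunteers"]
    v = min(vols, key=vols.get)
    vols[v] += 1
    return s, v

print([dispatch(f"task-{i}") for i in range(3)])
```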

5.
Accurate, continuous resource monitoring and profiling are critical for enabling performance tuning and scheduling optimization. In desktop grid systems that employ sandboxing, these issues are challenging because (1) subjobs inside sandboxes are executed in a virtual computing environment and (2) the state of this virtual environment within the sandboxes is reset to an initial empty state after a subjob completes. DGMonitor is a monitoring tool that builds a global, accurate, and continuous view of real resource utilization for desktop grids with sandboxing. Our monitoring tool measures performance unobtrusively and reliably, uses a simple performance data model, and is easy to use. Our measurements demonstrate that DGMonitor can scale to large desktop grids (up to 12000 PCs) with low monitoring overhead in terms of resource consumption (less than 0.1% per machine). Though we originally developed DGMonitor with the Entropia DCGrid platform, our tool is easily portable and can be integrated into other desktop grid systems. In all of these systems, DGMonitor data can support existing and novel information services, particularly for performance tuning and scheduling. In this paper, the high scalability and monitoring power of DGMonitor are demonstrated with the Entropia DCGrid platform and the BOINC platform, respectively.

6.
Understanding the behavior of large scale distributed systems is generally extremely difficult, as it requires observing a very large number of components over long periods of time. Most analysis tools for distributed systems gather basic information such as individual processor or network utilization. Although scalable because of the data reduction techniques applied before the analysis, these tools are often insufficient to detect or fully understand anomalies in the dynamic behavior of resource utilization and their influence on application performance. In this paper, we propose a methodology for detecting resource usage anomalies in large scale distributed systems. The methodology relies on four functionalities: characterized trace collection, multi-scale data aggregation, specifically tailored user interaction techniques, and visualization techniques. We show the efficiency of this approach through the analysis of simulations of the Berkeley Open Infrastructure for Network Computing (BOINC) volunteer computing architecture. Three scenarios are analyzed in this paper: analysis of the resource sharing mechanism, resource usage considering response time instead of throughput, and the evaluation of the effect of input file size on the BOINC architecture. The results show that our methodology makes it easy to identify resource usage anomalies such as unfair resource sharing, contention, moving network bottlenecks, and harmful short-term resource sharing. Copyright © 2011 John Wiley & Sons, Ltd.

7.
Gao Ming, Chen Guoyang. Application Research of Computers, 2024, 41(3): 811-817+841
With the continuing development of edge computing, problems have emerged in its resource management and configuration, and serverless computing offers a new way to address them. However, serverless computing lacks the service load scheduling capability needed to handle requests efficiently in distributed edge scenarios. To address this, a service load scheduling algorithm (SLSA) based on serverless edge computing is proposed. The core of SLSA is to optimize overall latency through implicit modeling that fully accounts for dynamically changing node states, load scheduler placement, and other influencing factors, and then to schedule services with an improved smooth weighted round robin (SWRR) algorithm. Simulation results show that SLSA markedly reduces resource consumption and performs well in both single-city and multi-city scenarios, improving on the round robin centralized (RRC) algorithm by 43.01% in the single-city scenario and by 53.81% in the multi-city scenario. The experiments show that SLSA effectively lowers the resource consumption rate while improving performance.
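SLSA builds on smooth weighted round robin. The sketch below shows the standard SWRR selection step (as popularized by nginx), not the paper's improved variant, and the node names and weights are made-up examples.

```python
# Standard smooth weighted round robin (SWRR) selection; the paper's improved
# variant is not specified in the abstract, so this is the baseline algorithm only.

nodes = {"edge-a": 5, "edge-b": 1, "edge-c": 1}       # hypothetical static weights
current = {name: 0 for name in nodes}                  # per-node smoothing counters
total = sum(nodes.values())

def pick() -> str:
    """Each call spreads selections smoothly in proportion to the weights."""
    for name, weight in nodes.items():
        current[name] += weight
    best = max(current, key=current.get)
    current[best] -= total
    return best

print([pick() for _ in range(7)])   # a, a, b, a, c, a, a
```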

8.
Energy efficiency is a major concern in modern high performance computing (HPC) systems, and power-aware scheduling is a promising way to achieve it. While there are a number of studies of power-aware scheduling by means of dynamic power management (DPM) and/or dynamic voltage and frequency scaling (DVFS) techniques, most of them only consider scheduling in a steady state. However, HPC applications such as scientific visualization often need deadline constraints to guarantee timely completion. In this paper we present power-aware scheduling algorithms with deadline constraints for heterogeneous systems. We formulate the problem by extending traditional multiprocessor scheduling and design approximation algorithms with analysis of their worst-case performance. We also present a pricing scheme in which the price of a task varies with its energy usage and, to a large extent, with the tightness of its deadline. Finally, we extend the proposed algorithm to control dependence graphs and to the more realistic online case. Through extensive experiments, we demonstrate that the proposed algorithm achieves near-optimal energy efficiency, on average 16.4% better for synthetic workloads and 12.9% better for realistic workloads than an EDD (Earliest Due Date)-based algorithm; the extended online algorithm also outperforms an EDF (Earliest Deadline First)-based algorithm, with up to 26% energy savings and 22% higher deadline satisfaction on average. Experiments also show that the pricing scheme provides a flexible trade-off between deadline tightness and price.
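The EDF baseline mentioned above orders tasks by deadline; the sketch below pairs that ordering with a simple DVFS rule (run each task at the lowest discrete frequency that still meets its deadline). The frequency levels and the single-processor setting are illustrative assumptions, not the paper's formulation.

```python
# Illustrative EDF + DVFS sketch: not the paper's algorithm, just the baseline idea.
# Assumption: lower frequency saves energy but stretches execution time linearly.

freqs = [0.5, 0.75, 1.0]                 # hypothetical normalized frequency levels

def schedule(tasks):
    """tasks: list of (name, work_at_full_speed, deadline). Returns (name, freq) picks."""
    t, plan = 0.0, []
    for name, work, deadline in sorted(tasks, key=lambda x: x[2]):   # EDF order
        # lowest frequency whose finish time still meets the deadline, else run flat out
        f = next((f for f in freqs if t + work / f <= deadline), 1.0)
        t += work / f
        plan.append((name, f))
    return plan

print(schedule([("viz", 2.0, 8.0), ("sim", 1.0, 3.0)]))   # [('sim', 0.5), ('viz', 0.5)]
```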

9.
A Survey of Resource Scheduling Research in Cloud Computing (total citations: 27; self-citations: 5; cited by others: 22)
Resource scheduling is a major research direction in cloud computing. This survey first investigates and analyzes the current state of research on cloud resource scheduling. It then focuses on scheduling methods aimed at reducing the energy consumption of cloud data centers, resource management methods aimed at improving system resource utilization, and economics-based cloud resource management models; it presents a minimum-energy cloud resource scheduling model and a minimum-server-count cloud resource scheduling model, and analyzes and compares existing cloud resource scheduling methods in depth. Finally, it identifies important future directions for cloud resource management: prediction-based resource scheduling, scheduling that trades off energy consumption against performance, resource management policies and mechanisms for different application workloads, integrated allocation of computing capacity (CPU, memory) and network bandwidth, and multi-objective resource scheduling, in the hope of providing a useful reference for cloud computing research.
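The minimum-server-count model mentioned above is, in essence, a bin-packing problem. The sketch below uses the standard first-fit-decreasing heuristic on a single CPU dimension as a simplified stand-in for the survey's model; the capacities and demands are made-up examples.

```python
# Simplified illustration of the minimum-server-count idea: consolidate jobs onto as
# few servers as possible. First-fit decreasing over a single CPU dimension only.

def pack(core_demands, cores_per_server=16):
    """Greedy first-fit decreasing; returns the per-server core usage."""
    servers = []
    for demand in sorted(core_demands, reverse=True):
        for i, used in enumerate(servers):
            if used + demand <= cores_per_server:
                servers[i] += demand
                break
        else:
            servers.append(demand)          # open a new server
    return servers

loads = pack([10, 8, 6, 4, 2, 2])
print(len(loads), loads)                    # 2 servers: [16, 16]
```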

10.
Cloud computing has permeated the information technology industry in the last few years and is now emerging in scientific environments. Science user communities are demanding a broad range of computing power to satisfy the needs of high-performance applications, ranging from local clusters to high-performance computing systems and computing grids. Different computational models produce different workloads, and the cloud is already considered a promising paradigm. The scheduling and allocation of resources is always a challenging matter in any form of computation, and clouds are no exception. Science applications have unique features that differentiate their workloads; hence, their requirements have to be taken into consideration when building a Science Cloud. This paper discusses the main scheduling and resource allocation challenges for any Infrastructure as a Service provider supporting scientific applications.

11.
Today, in an energy-aware society, job scheduling is becoming an important task for computer engineers and system analysts, as it can shape the performance-per-watt trade-off of computing infrastructures. New algorithms, together with a simulator of computing environments, can help information and communications technology and data center managers make decisions on a solid experimental basis. Several simulators try to address performance and, to some extent, estimate energy consumption, but there is none in which the energy model is based on benchmark data countersigned by independent bodies such as the Standard Performance Evaluation Corporation. This is the reason why we have implemented a performance and energy-aware scheduling (PEAS) simulator for high-performance computing. Furthermore, to evaluate the simulator, we propose an implementation of the non-dominated sorting genetic algorithm-II (NSGA-II), a fast and elitist multiobjective genetic algorithm, for resource selection. With the help of the PEAS simulator, we have studied whether it is possible to provide an intelligent job allocation policy able to save energy and time without compromising performance. The results of our simulations show a great improvement in response time and power consumption. In most cases, NSGA-II performs better than other 'intelligent' algorithms such as multiobjective heterogeneous earliest finish time, and it clearly outperforms the first-fit algorithm. We demonstrate the usefulness of the simulator for this type of study and conclude that the superior behavior of multiobjective algorithms makes them recommended for use in modern scheduling systems. Copyright © 2015 John Wiley & Sons, Ltd.
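NSGA-II ranks candidate allocations by Pareto dominance over the objectives (here, response time and energy). The sketch below shows only the dominance test and a non-dominated filter with made-up objective tuples; it is not the full NSGA-II loop used in the PEAS simulator.

```python
# Pareto-dominance filter over (time, energy) objectives -- the core relation NSGA-II
# sorts by. Full NSGA-II adds non-dominated sorting ranks and crowding distance.

def dominates(a, b):
    """a dominates b if it is no worse in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated points."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (response_time, energy) values for four candidate allocations.
candidates = [(12.0, 300.0), (15.0, 250.0), (14.0, 320.0), (11.0, 340.0)]
print(pareto_front(candidates))   # [(12.0, 300.0), (15.0, 250.0), (11.0, 340.0)]
```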

12.
Parallel Task Scheduling in Networked Cluster Computing Systems (total citations: 12; self-citations: 0; cited by others: 12)
Based on a multiprocessor parallel task scheduling model, this paper studies parallel task scheduling in networked cluster computing systems. It first proves approximation-hardness results for scheduling algorithms in general networked cluster computing systems, and then proposes three heuristic algorithms: largest-length-first, largest-width-first, and largest-area-first scheduling. Extensive simulation experiments compare these algorithms with scheduling algorithms previously proposed in the literature; the results show that the proposed heuristics outperform the existing algorithms.
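For rigid parallel tasks characterized by a processor requirement (width) and an execution time (length), the three heuristics above differ only in the key used to order tasks before greedy placement. The sketch below shows that ordering step with hypothetical task data; the placement rule itself is not detailed in the abstract and is omitted.

```python
# Ordering step of the three heuristics for rigid parallel tasks.
# Each task needs `width` processors for `length` time; "area" = width * length.
# The packing/placement step that follows the ordering is omitted here.

tasks = [                       # hypothetical task set
    {"id": "A", "width": 4, "length": 10},
    {"id": "B", "width": 16, "length": 2},
    {"id": "C", "width": 8, "length": 6},
]

keys = {
    "largest-length-first": lambda t: t["length"],
    "largest-width-first":  lambda t: t["width"],
    "largest-area-first":   lambda t: t["width"] * t["length"],
}

for name, key in keys.items():
    order = [t["id"] for t in sorted(tasks, key=key, reverse=True)]
    print(name, order)
# largest-length-first ['A', 'C', 'B']
# largest-width-first  ['B', 'C', 'A']
# largest-area-first   ['C', 'A', 'B']
```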

13.
While virtualization makes software applications on cloud computing platforms more efficient, it also brings challenges for resource management and service scheduling. After examining the differences between scheduling software as a service (SaaS) and infrastructure as a service (IaaS), this paper focuses on resource scheduling at the SaaS layer and proposes a scheduling model based on stochastic theory, formulating scheduling at this layer as a multi-objective optimization problem. In addition to quality-of-service requirements, the model also considers elasticity, a key characteristic of cloud services, and provides a strategy for matching tasks to elastic service replicas. Experiments show that the proposed scheduling mechanism improves the overall performance of the cloud platform and achieves good load balancing and resource utilization.

14.
Aiming to improve the efficiency of cloud computing, this work starts from job scheduling and resource management, the two most important factors determining the energy efficiency of a cloud environment, and studies how scheduling and management objects collect the reference metrics used as system statistics. A performance agent and service interface (PASI) model collects raw data from the cloud environment. Security techniques applied to clouds are usually passive defenses, such as real-time dynamic monitoring or deep packet inspection (DPI), whose main purpose is dynamic detection and defense. This paper proposes an effective framework, built by combining PASI with DPI, for improving the overall performance of a cloud computing center, together with a job scheduling and resource management model based on secure PASI (JDRMSP). Experimental results show that the model effectively improves the overall operating efficiency of the cloud environment.

15.
While the majority of CPUs now sold contain multiple computing cores, current grid computing systems either ignore the multiplicity of cores, or treat them as distinct, independent machines. The latter approach ignores the resource contention present between cores in a single CPU, while the former approach fails to take advantage of significant computing power. We provide a decentralized resource management framework for exploiting multi-core nodes to run multi-threaded applications in peer-to-peer grids. We present two new load-balancing schemes that explicitly account for the resource sharing and contention of multiple cores, and propose a parameterized performance prediction model that can represent a continuum of resource sharing among cores of a CPU. We use extensive simulation to confirm that our two algorithms match jobs with computing nodes efficiently, and balance load during the lifetime of the computing jobs.
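The parameterized prediction model described above treats the per-core slowdown from sharing a CPU as a tunable parameter. The sketch below encodes one common way to do that (a linear contention penalty controlled by a sharing coefficient); it is an assumption for illustration, not the paper's exact formula.

```python
# Illustrative per-core performance model under contention (not the paper's exact model).
# sigma = 0 -> cores are fully independent; sigma = 1 -> cores fully share one resource.

def effective_speed(base_speed: float, active_cores: int, sigma: float) -> float:
    """Per-core speed when `active_cores` threads run on one CPU with sharing factor sigma."""
    return base_speed / (1.0 + sigma * (active_cores - 1))

for cores in (1, 2, 4):
    print(cores, effective_speed(1.0, cores, sigma=0.3))
# 1 -> 1.0, 2 -> ~0.77, 4 -> ~0.53
```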

16.
This article considers the resource allocation and scheduling problem in a grid computing environment. It proposes system optimisation scheduling (SOS), which jointly optimises objectives for both the resource and application layers, combining the benefits of application-oriented and resource-oriented scheduling. Grid systems strive to find an optimal balance between user satisfaction and resource utilisation. Utility functions are used to express the grid user's quality-of-service requirements, the resource provider's benefit function, and the system's objectives. To verify the efficiency of the proposed scheduling algorithm, we compare the performance of application optimisation scheduling, resource optimisation scheduling, and SOS against a traditional round-robin algorithm. The simulations study the effect of the request rate and the task-to-resource ratio on the different scheduling algorithms.
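As a toy illustration of expressing both sides' goals as utilities and blending them into a single system objective, the sketch below uses a deadline-based user utility, a utilisation-based provider utility, and an assumed weighting; the actual utility functions in the article are not given in the abstract.

```python
# Toy utility model: combine user satisfaction and resource utilisation into one
# system objective. The weighting and the utility shapes are assumptions.

def user_utility(finish_time: float, deadline: float) -> float:
    """1.0 if the job finishes by its deadline, decaying linearly to 0 at 2x the deadline."""
    return max(0.0, min(1.0, 2.0 - finish_time / deadline))

def system_utility(finish_time: float, deadline: float,
                   utilisation: float, alpha: float = 0.5) -> float:
    """Weighted blend of user-side and provider-side objectives."""
    return alpha * user_utility(finish_time, deadline) + (1 - alpha) * utilisation

print(system_utility(finish_time=90.0, deadline=100.0, utilisation=0.8))    # 0.9
print(system_utility(finish_time=150.0, deadline=100.0, utilisation=0.8))   # 0.65
```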

17.
With the development of network technology, cluster computing management software, as middleware for the unified management of cluster computing resources, is attracting increasing attention. How to manage geographically dispersed computing resources sensibly and achieve resource sharing is key to getting full performance out of a cluster system. This paper first presents a resource management model based on the concept of resource sets, then studies resource allocation and load management in cluster systems, and illustrates the application of the resource management model and policies through the job scheduling process.

18.
Symbolic computation has underpinned a number of key advances in Mathematics and Computer Science. Applications are typically large and potentially highly parallel, making them good candidates for parallel execution at a variety of scales from multi-core to high-performance computing systems. However, much existing work on parallel computing is based around numeric rather than symbolic computations. In particular, symbolic computing presents problems in terms of varying granularity and irregular task sizes that do not match conventional approaches to parallelisation. It also presents problems in terms of the structure of the algorithms and data. This paper describes a new implementation of the free open-source GAP computational algebra system that places parallelism at the heart of the design, dealing with the key scalability and cross-platform portability problems. We provide three system layers that deal with the three most important classes of hardware: individual shared memory multi-core nodes, mid-scale distributed clusters of (multi-core) nodes and full-blown high-performance computing systems, comprising large-scale tightly connected networks of multi-core nodes. This requires us to develop new cross-layer programming abstractions in the form of new domain-specific skeletons that allow us to seamlessly target different hardware levels. Our results show that, using our approach, we can achieve good scalability and speedups for two realistic exemplars, on high-performance systems comprising up to 32000 cores, as well as on ubiquitous multi-core systems and distributed clusters. The work reported here paves the way towards full-scale exploitation of symbolic computation by high-performance computing systems, and we demonstrate the potential with two major case studies. © 2016 The Authors. Concurrency and Computation: Practice and Experience, published by John Wiley & Sons Ltd.

19.
The defining characteristic of cloud services is on-demand delivery: virtualization builds the relevant resources into a unified scheduling pool and provides services according to user needs, so cloud services offer parallel computing, openness, and on-demand delivery. For a practical training platform, the cloud environment must serve a wide range of user needs: requested tasks vary widely, experiment types differ, and device resources are heterogeneous. Standardized management and resource sharing are achieved through virtualization, and cloud resources must be scheduled to meet user needs effectively. This paper therefore proposes a dynamic migration strategy for a practical training platform in a cloud computing environment. The strategy designs a three-layer cooperative resource scheduling mechanism to manage resources and tasks, focusing on task partitioning, resource partitioning, and resource scheduling policies. On this basis, simulation experiments on the system verify the feasibility and effectiveness of the proposed dynamic migration strategy.

20.
For distributed high-performance computing systems, emulating immune mechanisms to monitor and evaluate system performance is a new research direction. This paper analyzes and compares the similarities and differences between immune mechanisms and computing system rejuvenation, builds a multi-agent-based logical model for system rejuvenation, emulates immune mechanisms to monitor and diagnose system performance and to build a mathematical model of performance degradation, and evaluates in simulation the impact of performance monitoring on the monitored compute nodes. On this basis, an application study is carried out on an audio-visual resource transaction processing system, and a two-stage hyperexponential distribution model is given to evaluate performance. The simulation and application results show that the approach is effective and feasible.
