Similar Literature
20 similar documents found.
1.
Load sharing in large, heterogeneous distributed systems allows users to access vast amounts of computing resources scattered around the system and may provide substantial performance improvements to applications. We discuss the design and implementation issues in Utopia, a load sharing facility specifically built for large and heterogeneous systems. The system has no restriction on the types of tasks that can be remotely executed, involves few application changes and no operating system change, supports a high degree of transparency for remote task execution, and incurs low overhead. The algorithms for managing resource load information and task placement take advantage of the clustering nature of large-scale distributed systems; centralized algorithms are used within host clusters, and directed graph algorithms are used among the clusters to make Utopia scalable to thousands of hosts. Task placements in Utopia exploit the heterogeneous hosts and consider varying resource demands of the tasks. A range of mechanisms for remote execution is available in Utopia, providing varying degrees of transparency and efficiency. A number of applications have been developed for Utopia, ranging from a load sharing command interpreter, to parallel and distributed applications, to a distributed batch facility. For example, an enhanced Unix command interpreter allows arbitrary commands and user jobs to be executed remotely, and a parallel make facility achieves speed-ups of 15 or more by processing a collection of tasks in parallel on a number of hosts.
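As a toy illustration of cluster-local placement on heterogeneous hosts (a deliberate simplification far short of Utopia's actual placement algorithms; all field names are invented): pick, within the local cluster, the least-loaded host that satisfies the task's resource demand, and fall back to inter-cluster placement otherwise.

```python
# Toy cluster-local placement sketch: choose the least-loaded host in the cluster
# that meets the task's resource demand. Not Utopia's actual algorithm.
def place_task(cluster_hosts, demand):
    """cluster_hosts: list of dicts with 'name', 'cpu_free', 'mem_free_mb', 'load'.
    demand: dict with 'cpu' and 'mem_mb'. Returns the chosen host name or None."""
    candidates = [
        h for h in cluster_hosts
        if h["cpu_free"] >= demand["cpu"] and h["mem_free_mb"] >= demand["mem_mb"]
    ]
    if not candidates:
        return None                       # would escalate to inter-cluster placement
    return min(candidates, key=lambda h: h["load"])["name"]

hosts = [
    {"name": "hostA", "cpu_free": 2, "mem_free_mb": 2048, "load": 0.7},
    {"name": "hostB", "cpu_free": 4, "mem_free_mb": 8192, "load": 0.2},
]
print(place_task(hosts, {"cpu": 2, "mem_mb": 4096}))   # -> hostB
```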

2.
For distributed high-performance computing systems, emulating immune mechanisms to monitor and evaluate system performance is a novel research direction. This paper analyzes and compares the similarities and differences between immune mechanisms and the rejuvenation of computing systems, constructs a multi-agent-based logical model for system rejuvenation, and, by emulating immune mechanisms, monitors and diagnoses system performance and builds a mathematical model of performance degradation; simulation experiments evaluate the impact of performance monitoring on the monitored compute nodes. On this basis, an application study is carried out in the context of an audio-visual resource transaction processing system, and a two-stage hyperexponential distribution model is given to evaluate performance. The results of the simulation experiments and the application study show that the method is effective and feasible.
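The abstract cites a two-stage hyperexponential model without giving its form; for orientation only (this is the standard textbook form, not necessarily the exact model fitted in the paper), a two-stage hyperexponential distribution of a performance metric T is:

```latex
% Standard two-stage hyperexponential distribution (generic form, not quoted from the paper):
% with probability p_1 the metric is Exp(\lambda_1), with probability p_2 = 1 - p_1 it is Exp(\lambda_2).
F(t) = 1 - p_1 e^{-\lambda_1 t} - p_2 e^{-\lambda_2 t}, \qquad
\mathbb{E}[T] = \frac{p_1}{\lambda_1} + \frac{p_2}{\lambda_2}, \qquad
p_1 + p_2 = 1, \; t \ge 0 .
```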

3.
Programming distributed applications supporting data sharing is very hard. In most middleware platforms, programmers must deal with system-level issues for which they do not have adequate knowledge, e.g., object replication, abusive resource consumption by mobile agents, and distributed garbage collection. As a result, programmers are diverted from their main task: the application logic. In addition, given that such system-level issues are extremely error-prone, programmers spend countless hours debugging. We designed, implemented, and evaluated a middleware platform called OBIWAN that releases the programmer from the above-mentioned system-level issues. OBIWAN has the following distinctive characteristics: 1) it allows the programmer to develop applications using either remote object invocation, object replication, or mobile agents, according to the specific needs of applications; 2) it supports automatic object replication (e.g., incremental on-demand replication, transparent object faulting and serving, etc.); 3) it supports distributed garbage collection of useless replicas; and 4) it supports the specification and enforcement of history-based security policies well adapted to mobile agents' needs (e.g., preventing abusive resource consumption).
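One way to picture a history-based policy against abusive resource consumption (a toy check under invented names, not OBIWAN's actual policy language): accumulate each mobile agent's usage across the hosts it has visited and refuse further execution once a quota is exceeded.

```python
# Toy history-based policy: an agent's cumulative CPU usage across visited hosts
# must stay under a quota. Not OBIWAN's actual policy mechanism.
class AgentHistory:
    def __init__(self, cpu_quota_seconds: float):
        self.cpu_quota = cpu_quota_seconds
        self.visits: list[tuple[str, float]] = []   # (host, cpu_seconds used there)

    def record(self, host: str, cpu_seconds: float) -> None:
        self.visits.append((host, cpu_seconds))

    def may_execute(self) -> bool:
        """Allow execution only while the accumulated history stays within the quota."""
        return sum(cpu for _, cpu in self.visits) < self.cpu_quota

history = AgentHistory(cpu_quota_seconds=60.0)
history.record("hostA", 25.0)
history.record("hostB", 40.0)
print(history.may_execute())   # False: 65 s of CPU already consumed across hosts
```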

4.
Grid applications must adapt to dynamically changing execution environments. This paper explores a dynamic and adaptive resource management model and designs and implements a service-oriented distributed virtual machine, the Abacus virtual machine. Based on the resource management policy and the resources available at run time, the Abacus virtual machine adaptively allocates resources to applications in a distributed environment. Experimental results show that this adaptive resource management approach is feasible and effective.

5.
Dynamic distributed real-time applications run on clusters with varying execution times, so re-allocation of resources is critical to meeting the applications' deadlines. In this paper we present two adaptive resource management techniques for dynamic real-time applications, employing prediction of the responses of real-time tasks that operate in a time-sharing environment and run-time analysis of scheduling policies. Prediction of response time for resource reallocation is accomplished by historical profiling of applications' resource usage to estimate resource requirements on the target machine, and a probabilistic approach is applied to calculate the queuing delay that a process will experience on distributed hosts. Results show that, compared to statistical and worst-case approaches, our technique uses system resources more efficiently.
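The abstract does not give the probabilistic model itself; as one plausible illustration (not the authors' method, and the host parameters are invented), an M/M/1-style estimate of queuing delay can be derived from historically profiled arrival and service rates and combined with the profiled execution time to pick a host that still meets the deadline:

```python
def mm1_queuing_delay(arrival_rate, service_rate):
    """Expected waiting time in queue under an M/M/1 host model.
    arrival_rate: mean task arrivals per second observed on the host (historical profile).
    service_rate: mean tasks the host can complete per second."""
    rho = arrival_rate / service_rate          # utilisation
    if rho >= 1.0:
        return float("inf")                    # saturated host: unbounded delay
    return rho / (service_rate - arrival_rate) # classic M/M/1 result: Wq = rho / (mu - lambda)


def pick_host(hosts, deadline, exec_estimate):
    """hosts: {name: (arrival_rate, service_rate)}. Choose the host whose predicted
    response time (queuing delay + profiled execution time) still meets the deadline."""
    predicted = {
        name: mm1_queuing_delay(lam, mu) + exec_estimate
        for name, (lam, mu) in hosts.items()
    }
    feasible = {n: r for n, r in predicted.items() if r <= deadline}
    return min(feasible, key=feasible.get) if feasible else None


if __name__ == "__main__":
    hosts = {"node1": (4.0, 5.0), "node2": (2.0, 5.0)}   # (arrival rate, service rate)
    print(pick_host(hosts, deadline=1.5, exec_estimate=0.4))   # -> node2
```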

6.
An LDAP-based grid monitoring system (total citations: 40; self-citations: 3; citations by others: 40)
For a high-performance distributed computing environment such as the grid, monitoring the state of its computing resources is essential: monitoring makes it possible to detect and eliminate faults in time, and analysis of monitoring data reveals performance bottlenecks and provides a reliable basis for system tuning. GridMon is a distributed grid monitoring system based on the LDAP directory service. It departs from the conventional practice of not storing dynamic information in a directory service and flexibly combines static and dynamic information in the directory hierarchy, thereby reducing the number of client-server interactions; middleware techniques are used to address the security and interface problems caused by direct access to the monitored hosts. Using the LDAP directory hierarchy, a tree-structured basic model of the grid system is established. The concepts of grid monitoring objects and monitoring events and their representations are proposed, forming a complete structural model for grid monitoring, and a grid monitoring prototype system, GridMon, implemented according to this model is discussed in detail. Finally, based on the structural differences between grids and cluster systems, the key points for evaluating a grid monitoring system are set out, and on that basis GridMon is evaluated objectively in light of its application prospects.
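To illustrate the idea of keeping static and dynamic host attributes together in one directory hierarchy (the DN layout and attribute names below are hypothetical, not GridMon's actual schema), a client could fetch both kinds of information in a single LDAP search using the ldap3 library:

```python
# Hypothetical sketch: querying static + dynamic monitoring attributes of one host
# from an LDAP directory in a single round trip (DN and attribute names are invented).
from ldap3 import Server, Connection, ALL

server = Server("ldap://gridmon.example.org", get_info=ALL)
conn = Connection(server, auto_bind=True)          # anonymous bind for read-only queries

conn.search(
    search_base="hn=node01,cl=cluster-a,o=grid",   # tree structure: organisation -> cluster -> host
    search_filter="(objectClass=GridHost)",
    attributes=["cpuModel", "memTotal",                  # static attributes
                "loadAvg1m", "memFree", "lastUpdate"],   # dynamic attributes refreshed by the monitor
)
for entry in conn.entries:
    print(entry.entry_dn, entry.loadAvg1m, entry.memFree)
```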

7.
Experimental applications of data from multispectral and other advanced sensors have demonstrated that remote sensing can make a valuable contribution to the monitoring and management of Canada's land resources. More frequent coverage and additional spectral bands on satellites planned for the mid-1980s and beyond will increase the opportunities for regular use of remotely sensed data. To effectively utilize these data in resource management, the remote sensing input must be matched with the resource management systems existing at that time. Thus, it is essential to anticipate the needs of resource management systems of the late 1980s and 1990s, to determine the appropriate role for remotely sensed data, and to develop and implement a plan which will yield the remote sensing systems and methodologies necessary to meet the operational resource management requirements.

Previous studies of resource information requirements indicate that there will be a need for geocoded remotely sensed data, improved image analysis techniques and better information integration concepts for future resource management systems. To develop a plan for meeting the anticipated requirements, the flow from the recording of the remotely sensed data to the end use of the derived information is considered first. The timeliness and accuracy requirements of different users, the diverse data types and forms for individual applications, the analysis methods/decision models needed and the implications of these factors for the configuration(s) of remote sensing input into the future resource management systems are examined. From this analysis, areas requiring further work (research, development, demonstration, transfer) are identified, and a plan of action is suggested.

8.
With the development of space technology, more and more information needs to be processed on board, and distributed satellite-cluster technology has become a research hotspot in recent years. Unlike the ground environment, satellites are constrained by volume, power consumption and space radiation, so building a distributed environment on a satellite cluster requires mastering key technologies such as architecture design, operating systems and resource management. Addressing these issues, this paper first studies a distributed computing and storage architecture for satellite clusters, then studies the implementation of distributed resource monitoring for satellite clusters, and finally demonstrates the feasibility and advantages of the approach through a satellite-cluster verification system.

9.
Resource management is an important aspect to consider regarding applications that might have different non‐functional or operational requirements when running in distributed and heterogeneous environments. In this context, it is necessary to provide the means to specify the required resource constraints and an infrastructure that can adapt the applications in light of changes in resource availability. We adopted a contract‐based approach to describe and maintain parallel applications that have non‐functional requirements in a Computing Grid context, called ZeliGrid. To form the supporting infrastructure we have designed a software architecture that integrates some of the Globus services, the LDAP and the NWS monitoring services. Some modules that map the contract approach into software artifacts were also integrated into this architecture. This paper addresses the architecture and integration issues of our approach, as well as how we put the pieces together, highlighting deployment and implementation details, which have to consider diverse aspects such as monitoring, security and dynamic reconfiguration. Copyright © 2010 John Wiley & Sons, Ltd.
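A minimal sketch of the contract idea (the clause names and thresholds are illustrative, not ZeliGrid's actual contract language): the application states resource constraints, and the infrastructure compares them against values reported by a monitoring service such as NWS to decide whether reconfiguration is needed.

```python
# Illustrative sketch of a resource contract checked against monitored values
# (field names and thresholds are hypothetical, not ZeliGrid's own format).
contract = {
    "min_free_cpu": 0.25,        # at least 25% idle CPU on each node
    "min_bandwidth_mbps": 50.0,  # minimum available bandwidth between nodes
    "max_latency_ms": 20.0,
}

def violations(monitored, contract):
    """Return the list of contract clauses the current measurements violate."""
    broken = []
    if monitored["free_cpu"] < contract["min_free_cpu"]:
        broken.append("min_free_cpu")
    if monitored["bandwidth_mbps"] < contract["min_bandwidth_mbps"]:
        broken.append("min_bandwidth_mbps")
    if monitored["latency_ms"] > contract["max_latency_ms"]:
        broken.append("max_latency_ms")
    return broken

# Example values a monitoring service might report for one node pair
sample = {"free_cpu": 0.10, "bandwidth_mbps": 80.0, "latency_ms": 12.0}
if violations(sample, contract):
    print("contract violated:", violations(sample, contract), "-> trigger reconfiguration")
```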

10.
Distributed Java virtual machine (dJVM) systems enable concurrent Java applications to transparently run on clusters of commodity computers. This is achieved by supporting Java's shared‐memory model over multiple JVMs distributed across the cluster's computer nodes. In this work, we describe and evaluate selective dynamic diffing and lazy home allocation, two new runtime techniques that enable dJVMs to efficiently support memory sharing across the cluster. Specifically, the two proposed techniques, either in isolation or in combination, can contribute to reducing the overheads due to message traffic, extra memory space, and high latency of remote memory accesses that such dJVM systems incur when implementing their memory‐coherence protocol. In order to evaluate the performance‐related benefits of dynamic diffing and lazy home allocation, we implemented both techniques in Cooperative JVM (CoJVM), a basic dJVM system we developed in previous work. In subsequent work, we carried out performance comparisons between the basic CoJVM and modified CoJVM versions for five representative concurrent Java applications (matrix multiply, LU, Radix, fast Fourier transform, and SOR) using our proposed techniques. Our experimental results showed that dynamic diffing and lazy home allocation significantly reduced memory sharing overheads. The reduction resulted in considerable gains in the CoJVM system's performance, ranging from 9% up to 20%, in four out of the five applications, with resulting speedups varying from 6.5 up to 8.1 for an 8‐node cluster of computers. Copyright © 2007 John Wiley & Sons, Ltd.
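The "diffing" idea, common in software distributed shared memory and sketched here generically rather than as CoJVM's actual protocol, is to keep a twin copy of an object's state and ship only the byte ranges that changed since the last synchronisation:

```python
# Generic diff/patch sketch for shared-object state, in the spirit of "diffing"
# in software DSM systems; this is not CoJVM's actual wire format.
def make_diff(twin, current):
    """Return (offset, new_bytes) runs where `current` differs from the saved twin."""
    diff, run_start = [], None
    for i, (old, new) in enumerate(zip(twin, current)):
        if old != new and run_start is None:
            run_start = i
        elif old == new and run_start is not None:
            diff.append((run_start, current[run_start:i]))
            run_start = None
    if run_start is not None:
        diff.append((run_start, current[run_start:]))
    return diff                                  # only these runs, not the whole object, are sent

def apply_diff(state, diff):
    """Apply the received runs to the remote copy of the object."""
    for offset, data in diff:
        state[offset:offset + len(data)] = data

twin = bytes(64)                                 # snapshot taken when the object was last synced
current = bytearray(twin); current[8:12] = b"\x01\x02\x03\x04"
patch = make_diff(twin, bytes(current))
remote = bytearray(twin); apply_diff(remote, patch)
assert remote == current
```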

11.
Virtualization is a key technology for scaling up high-performance computing systems. The virtual computing-resource testbed at the Institute of High Energy Physics is built on the OpenStack cloud platform. This paper discusses three key factors in integrating virtual computing resources with the computing system: network architecture design, environment matching, and overall system planning. It first discusses the virtual network architecture: the virtualization platform deploys the neutron component, OVS, and the 802.1Q protocol to connect the virtual and physical networks directly at layer 2, and performs layer-3 forwarding on the physical switches, avoiding the bottleneck of routing traffic through the OpenStack network node. Second, to merge virtual computing resources into the computing system, information must be dynamically synchronized with the system's components to meet the needs of domain-name allocation, automated configuration, and monitoring; the paper introduces the in-house NETDB component, which synchronizes virtual machine information with the domain name system (DNS), the automated installation and management system (puppet), and the monitoring system. Finally, under overall system planning, the paper discusses unified authentication, shared storage, automated deployment, scaling, and images.
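The abstract only names the systems NETDB synchronizes with, so the sketch below is entirely hypothetical: every function is a placeholder standing in for the real OpenStack, DNS, puppet, and monitoring interfaces, and only the shape of one synchronization pass is illustrated.

```python
# Hypothetical sketch of a NETDB-style synchronisation pass; all functions are placeholders.
def list_vms_from_openstack():
    """Placeholder: would query the OpenStack API for instances (name, IP, MAC, state)."""
    return [{"name": "vm-worker-01", "ip": "192.168.10.21", "mac": "fa:16:3e:aa:bb:cc"}]

def sync_dns(vm):
    """Placeholder: create/refresh the DNS record for the VM."""
    print(f"DNS    : {vm['name']} -> {vm['ip']}")

def sync_puppet(vm):
    """Placeholder: register the VM as a puppet node so it receives its configuration."""
    print(f"puppet : register node {vm['name']}")

def sync_monitoring(vm):
    """Placeholder: add the VM as a monitored host."""
    print(f"monitor: watch {vm['name']} ({vm['ip']})")

def netdb_sync_pass():
    """One pass: push current VM records to DNS, puppet and monitoring."""
    for vm in list_vms_from_openstack():
        sync_dns(vm)
        sync_puppet(vm)
        sync_monitoring(vm)

if __name__ == "__main__":
    netdb_sync_pass()
```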

12.
The service‐oriented architecture paradigm can be exploited for the implementation of data and knowledge‐based applications in distributed environments. The Web services resource framework (WSRF) has recently emerged as the standard for the implementation of Grid services and applications. WSRF can be exploited for developing high‐level services for distributed data mining applications. This paper describes Weka4WS, a framework that extends the widely used open source Weka toolkit to support distributed data mining on WSRF‐enabled Grids. Weka4WS adopts the WSRF technology for running remote data mining algorithms and managing distributed computations. The Weka4WS user interface supports the execution of both local and remote data mining tasks. On every computing node, a WSRF‐compliant Web service is used to expose all the data mining algorithms provided by the Weka library. The paper describes the design and implementation of Weka4WS using the WSRF libraries and services provided by Globus Toolkit 4. A performance analysis of Weka4WS for executing distributed data mining tasks in different network scenarios is presented. Copyright © 2008 John Wiley & Sons, Ltd.

13.
Fault-tolerant grid architecture and practice (total citations: 10; self-citations: 0; citations by others: 10)
Grid computing has emerged as an effective technology for coupling geographically distributed resources and solving large-scale computational problems in wide area networks. Fault tolerance is a significant and complex issue in grid computing systems. Various techniques have been investigated to detect and correct faults in distributed computing systems, and fault detection based on unreliable fault detectors is one of the most effective. Globus, as a grid middleware, manages resources in a wide area network; the Globus fault detection service uses the well-known techniques based on unreliable fault detectors to detect and report component failures. However, more powerful techniques are required to detect and correct both system-level and application-level faults in a grid system, and a convenient toolkit is also needed to maintain consistency in the grid. A fault-tolerant grid platform (FTGP) based on an unreliable fault detector and the Globus fault detection service is presented in this paper. The platform offers effective strategies in three areas: key grid components, user tasks, and high-level applications.
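As background for the "unreliable fault detector" idea (a generic timeout-based heartbeat scheme, not FTGP's actual implementation), such a detector merely suspects components and may withdraw a suspicion when a late heartbeat arrives:

```python
import time

class UnreliableFailureDetector:
    """Generic heartbeat-based detector: it only *suspects* components and may be wrong,
    which is why it is called unreliable. Not FTGP's actual implementation."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self.last_heartbeat = {}

    def heartbeat(self, component):
        """Record a heartbeat; a previously suspected component is trusted again."""
        self.last_heartbeat[component] = time.monotonic()

    def suspected(self):
        """Components whose last heartbeat is older than the timeout."""
        now = time.monotonic()
        return {c for c, t in self.last_heartbeat.items() if now - t > self.timeout}

detector = UnreliableFailureDetector(timeout=2.0)
detector.heartbeat("gatekeeper")
detector.heartbeat("job-manager")
time.sleep(2.5)                      # no heartbeats arrive in the meantime
detector.heartbeat("job-manager")    # late but alive: suspicion (if any) is withdrawn
print(detector.suspected())          # {'gatekeeper'}
```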

14.
As large data centers that host multiple Web applications emerge, it is critical to isolate different application environments for security reasons and to provision shared resources effectively and efficiently to meet different service quality targets at minimum operational cost. To address this problem, we developed a novel architecture of resource management framework for multi-tier applications based on virtualization mechanisms. Key techniques presented in this paper include (1) establishment of an analytic performance model which employs probabilistic analysis and overload management to deal with non-equilibrium states; (2) a general formulation of the resource management problem which can be solved by incorporating both deterministic and stochastic optimizing algorithms; (3) deployment of virtual servers to partition resources at a much finer level; and (4) investigation of the impact of the failure rate to examine the effect of application isolation. Simulation experiments comparing three resource allocation schemes demonstrate the advantage of our dynamic approach in providing differentiated service qualities, preserving QoS levels in failure scenarios and also improving the overall performance while reducing the resource usage cost.
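The abstract does not spell out the analytic performance model; purely as a generic illustration (not the authors' model), a per-tier M/M/1 approximation ties the number of virtual servers allocated to a tier to its mean response time and yields a sizing constraint against a response-time target R_max:

```latex
% Generic per-tier M/M/1 sizing illustration (not the paper's actual model):
% \lambda_i = request rate at tier i, \mu_i = service rate of one virtual server,
% n_i = number of virtual servers allocated to tier i.
R_i = \frac{1}{n_i \mu_i - \lambda_i}, \qquad
\sum_i R_i \le R_{\max}, \qquad n_i \mu_i > \lambda_i .
```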

15.
Traditional resource management techniques (resource allocation, admission control and scheduling) have been found to be inadequate for many shared Grid and distributed systems, which consist of autonomous and dynamic distributed resources contributed by multiple organisations. They provide no incentive for users to request resources judiciously and appropriately, and do not accurately capture the true value, importance and deadline (the utility) of a user's job. Furthermore, they provide no compensation for resource providers to contribute their computing resources to shared Grids, as traditional approaches have a user-centric focus on maximising throughput and minimising waiting time rather than maximising a provider's own benefit. Consequently, researchers and practitioners have been examining the appropriateness of 'market-inspired' resource management techniques to address these limitations. Such techniques aim to smooth out access patterns and reduce the chance of transient overload, by providing a framework for users to be truthful about their resource requirements and job deadlines, and offering incentives for service providers to prioritise urgent, high utility jobs over low utility jobs. We examine the recent innovations in these systems (from 2000 to 2007), looking at the state-of-the-art in price setting and negotiation, Grid economy management and utility-driven scheduling and resource allocation, and identify the advantages and limitations of these systems. We then look to the future of these systems, examining the emerging 'Catallaxy' market paradigm. Finally we consider the future directions that need to be pursued to address the limitations of the current generation of market oriented Grids and Utility Computing systems.

16.
In pervasive computing environments, the large-scale use of sensor devices produces huge numbers of intricate atomic events, while many real-world applications, such as health care, facility supervision, and environmental/security monitoring, care more about the detection of composite events. Extracting interesting, useful composite events from these low-level atomic events has therefore become increasingly important. There is a large body of research on composite event detection with differing emphases: some stresses the temporal factor, especially the importance of time intervals; some studies composite event detection over distributed data sources; and recently composite event detection over uncertain data has been proposed. Given the growing importance of the topic, this paper analyzes the challenging problems in composite event detection research, surveys existing results from three perspectives, namely event types, temporal factors, and data accuracy, and points out future research directions.
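A toy illustration of the basic idea (a hand-written sequence pattern over a time window, not any of the surveyed detection engines): a composite event fires when atomic events of given types occur in order within a time bound.

```python
# Toy composite-event detector: fire when the atomic event types in `pattern`
# occur in order within `window` seconds. Purely illustrative; real engines
# (distributed, probabilistic, interval-based) are far richer.
def detect(events, pattern, window):
    """events: iterable of (timestamp, type); returns completed (timestamp, type) sequences."""
    matches, partial = [], []          # partial: in-progress candidate matches
    for ts, etype in sorted(events):
        partial = [m for m in partial if ts - m[0][0] <= window]   # drop expired candidates
        for m in partial:
            if etype == pattern[len(m)]:
                m.append((ts, etype))
        if etype == pattern[0]:
            partial.append([(ts, etype)])
        matches += [m for m in partial if len(m) == len(pattern)]
        partial = [m for m in partial if len(m) < len(pattern)]
    return matches

# e.g. "door_open followed by motion followed by window_break within 30 s"
stream = [(0, "door_open"), (12, "motion"), (20, "window_break"), (100, "motion")]
print(detect(stream, ["door_open", "motion", "window_break"], window=30))
```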

17.
Computational grids that couple geographically distributed resources such as PCs, workstations, clusters, and scientific instruments have emerged as a next generation computing platform for solving large-scale problems in science, engineering, and commerce. However, application development, resource management, and scheduling in these environments continue to be a complex undertaking. In this article, we discuss our efforts in developing a resource management system for scheduling computations on resources distributed across the world with varying quality of service (QoS). Our service-oriented grid computing system called Nimrod-G manages all operations associated with remote execution including resource discovery, trading, and scheduling based on economic principles and a user-defined QoS requirement. The Nimrod-G resource broker is implemented by leveraging existing technologies such as Globus, and provides new services that are essential for constructing industrial-strength grids. We present the results of experiments using the Nimrod-G resource broker for scheduling parametric computations on the World Wide Grid (WWG) resources that span five continents.
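To illustrate deadline-and-budget-constrained scheduling in the economic spirit of the abstract (a toy cost-optimal selection with invented rates and prices, not the actual Nimrod-G algorithms):

```python
# Toy "economic" broker step: among resources that can finish a batch of jobs by the
# deadline within budget, pick the cheapest. Not the actual Nimrod-G scheduling code.
def choose_resource(resources, jobs, deadline, budget):
    """resources: list of dicts with 'name', 'rate' (jobs/hour) and 'price' (cost per job).
    Returns the cheapest resource meeting both deadline and budget, or None."""
    feasible = []
    for r in resources:
        finish_time = jobs / r["rate"]              # hours to run all jobs on this resource
        total_cost = jobs * r["price"]
        if finish_time <= deadline and total_cost <= budget:
            feasible.append((total_cost, finish_time, r["name"]))
    return min(feasible)[2] if feasible else None   # cheapest first, then fastest

resources = [
    {"name": "cheap-cluster", "rate": 50.0,  "price": 0.02},
    {"name": "fast-cluster",  "rate": 200.0, "price": 0.10},
]
print(choose_resource(resources, jobs=400, deadline=10.0, budget=20.0))  # -> cheap-cluster
```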

18.
Research on Internet-based remote equipment monitoring technology (total citations: 9; self-citations: 1; citations by others: 9)
With the continuous development of information technology and the emergence of dynamic enterprise alliances, research on and application of remote equipment monitoring technology is on the rise worldwide. This paper proposes a framework for building remote equipment monitoring and control technology and introduces the key technologies for constructing a remote monitoring system; the remote monitoring system established can be used for remote information management, condition monitoring, and information and resource sharing for the equipment concerned.

19.
Design of the master station software for a CAN-bus-based remote power monitoring and management system (total citations: 3; self-citations: 2; citations by others: 1)
The master station software is a key part of the design of a remote power monitoring and management system. This paper introduces a CAN-bus-based remote power monitoring and management system, presents the functions and workflow of the master station software, describes the design of the communication module, and discusses information and resource sharing and program nesting between VB and LabVIEW in the master station.

20.
This paper introduces how cloud computing works, its differences from grid computing, and its characteristics and application models; briefly describes the key technologies of cloud computing, including virtualization, data storage, data management, programming models, and resource monitoring; and points out the main problems in applying cloud computing.
