Similar literature
 A total of 19 similar articles were found (search time: 156 ms)
1.
Fault-tolerant grid architecture and practice   Cited by: 10 (self-citations: 0, citations by others: 10)
Grid computing has emerged as an effective technology for coupling geographically distributed resources and solving large-scale computational problems in wide area networks. Fault tolerance is a significant and complex issue in grid computing systems. Various techniques have been investigated to detect and correct faults in distributed computing systems. Unreliable fault detection is one of the most effective techniques. Globus, as a grid middleware, manages resources in a wide area network. The Globus fault detection service uses well-known techniques based on unreliable fault detectors to detect and report component failures. However, more powerful techniques are required to detect and correct both system-level and application-level faults in a grid system, and a convenient toolkit is also needed to maintain consistency in the grid. A fault-tolerant grid platform (FTGP) based on an unreliable fault detector and the Globus fault detection service is presented in this paper. The platform offers effective strategies in three aspects: grid key components, user tasks, and high-level applications.

2.
Fundamentally, a semantic grid database is about bringing globally distributed databases together in order to coordinate resource sharing and problem solving in which information is given well-defined meaning. DartGrid II is an implemented database grid system whose goal is to provide a semantic solution for integrating database resources on the Web. Although many algorithms have been proposed for optimizing query processing in order to minimize the costs and/or response time associated with obtaining the answer to a query in a distributed database system, the database grid query optimization problem is fundamentally different from traditional distributed query optimization. These differences are shown to be consequences of the autonomy and heterogeneity of database nodes in a database grid. Therefore, query optimization poses more challenges in a database grid than in a traditional distributed database. Following this observation, the design of a query optimizer in DartGrid II is presented, and a heuristic, dynamic and parallel query optimization approach to processing queries in a database grid is proposed. A set of semantic tools supporting relational database integration and semantic-based information browsing has also been implemented to realize the above vision.

3.
Unified Parallel C (UPC) is a parallel extension of ANSI C based on the Partitioned Global Address Space (PGAS) programming model, which provides a shared memory view that simplifies code development while it can take advantage of the scalability of distributed memory architectures. Therefore, UPC allows programmers to write parallel applications on hybrid shared/distributed memory architectures, such as multi-core clusters, in a more productive way, accessing remote memory by means of different high-level language constructs, such as assignments to shared variables or collective primitives. However, the standard UPC collectives library includes a reduced set of eight basic primitives with quite limited functionality. This work presents the design and implementation of extended UPC collective functions that overcome the limitations of the standard collectives library, allowing, for example, the use of a specific source and destination thread or defining the amount of data transferred by each particular thread. This library fulfills the demands made by the UPC developers community and implements portable algorithms, independent of the specific UPC compiler/runtime being used. The use of a representative set of these extended collectives has been evaluated using two applications and four kernels as case studies. The results obtained confirm the suitability of the new library to provide easier programming without trading off performance, thus achieving high productivity in parallel programming to harness the performance of hybrid shared/distributed memory architectures in high performance computing.

4.
Both resource efficiency and application QoS have long been major concerns of datacenter operators, but they remain irreconcilable. High resource utilization increases the risk of resource contention between co-located workloads, which makes latency-critical (LC) applications suffer unpredictable, and even unacceptable, performance. Much prior work has been devoted to finding effective mechanisms that protect the QoS of LC applications while improving resource efficiency. In this paper, we propose MAGI, a resource management runtime that leverages neural networks to monitor and further pinpoint the root cause of performance interference, and adjusts the resource shares of the corresponding applications to ensure the QoS of LC applications. MAGI is used in practice in an Alibaba datacenter to provide on-demand resource adjustment for applications using neural networks. The experimental results show that MAGI can reduce the performance degradation of LC applications by up to 87.3% when they are co-located with antagonist applications.

5.
Recent advances in connected vehicles and autonomous driving are going to change the face of ground transportation as we know it. This paper describes the design and evaluation of several emerging applications for such a cyber transportation system (CTS). These applications have been designed using holistic approaches, which consider the unique roles played by the human drivers, the transportation system, and the communication network. They can improve driver safety and provide on-road infotainment. They can also improve transportation operations and efficiency, thereby benefiting travelers and attracting investment from both government agencies and private businesses to deploy infrastructures and bootstrap the evolutionary process of CTS.

6.
Devices based on optical microcavities, which confine light to small volumes by resonant recirculation, are already indispensable for a wide range of studies and applications. This article provides an overview of the development and application of optical microcavities. We first give a pedagogical introduction to the interaction between a two-level system and a quantized electromagnetic field in the cavity, based on the so-called Jaynes-Cummings model, which is a basic and important theoretical model in cavity quantum electrodynamics, and to various quantum phenomena and applications of it. Then, we overview three basic types of microcavity structures and highlight the progress achieved so far in these systems. Based on these three structures, we give an account of three representative applications of optical microcavities, and explain their microcavity requirements and the state of the art for these devices, before outlining the challenges for the future.

7.
Recent Advances in Evolutionary Computation   Cited by: 17 (self-citations: 0, citations by others: 17)
Evolutionary computation has experienced tremendous growth in the last decade in both theoretical analyses and industrial applications. Its scope has evolved beyond its original meaning of "biological evolution" toward a wide variety of nature-inspired computational algorithms and techniques, including evolutionary, neural, ecological, social and economical computation, etc., in a unified framework. Many research topics in evolutionary computation nowadays are not necessarily "evolutionary". This paper provides an overview of some recent advances in evolutionary computation that have been made in CERCIA at the University of Birmingham, UK. It covers a wide range of topics in optimization, learning and design using evolutionary approaches and techniques, and theoretical results in the computational time complexity of evolutionary algorithms. Some issues related to future development of evolutionary computation are also discussed.

8.
The efficiency of batch processing is becoming increasingly important for many modern commercial service centers, e.g., clusters and cloud computing datacenters. However, periodic resource contention has become the major performance obstacle for applications running concurrently on mainstream CMP servers. I/O contention is one such obstacle, and it may seriously impede both the co-running performance of batch jobs and the system throughput. In this paper, a dynamic I/O-aware scheduling algorithm is proposed to lower the impact of I/O contention and to enhance the co-running performance in batch processing. We set up our environment on an 8-socket, 64-core server in a Dawning Linux cluster. Fifteen workloads ranging from 8 jobs to 256 jobs are evaluated. Our experimental results show significant improvements in the throughput of the workloads, ranging from 7% to 431%. Meanwhile, noticeable improvements in the slowdown of workloads and the average runtime of each job can be achieved. These results show that a well-tuned dynamic I/O-aware scheduler is beneficial for batch-mode services. It can also enhance resource utilization via throughput improvement on modern service platforms.

9.
Ethernet networks have undergone impressive growth over the past few decades. This growth can be appreciated in terms of the equipment, such as switches and links, that has been added, as well as in the number of users that they support. In parallel to this expansion, over the past decade the networking research community has shown a growing interest in discovering and analyzing the Ethernet topology. Research in this area has concentrated on the theoretical analysis of Ethernet topology as well as on developing tools and methods for mapping the network layout. These efforts have brought us to a crucial juncture for Ethernet topology measurement infrastructures: while these were previously small (in terms of the number of measurement points), people are starting to see the deployment of large-scale distributed systems composed of hundreds or thousands of monitors. Looking forward to this next generation of systems, the authors take stock of what has been achieved so far. In this survey, the authors discuss past and current mechanisms for discovering the Ethernet topology from both theoretical and practical perspectives. In addition to discovery techniques, the authors provide insights into some of the well-known open issues related to Ethernet topology discovery.

10.
Technology enhancements and the growing breadth of application workflows running on high-performance computing (HPC) platforms drive the development of new data services that provide high performance on these new platforms, provide capable and productive interfaces and abstractions for a variety of applications, and are readily adapted when new technologies are deployed. The Mochi framework enables composition of specialized distributed data services from a collection of connectable modules and subservices. Rather than forcing all applications to use a one-size-fits-all data staging and I/O software configuration, Mochi allows each application to use a data service specialized to its needs and access patterns. This paper introduces the Mochi framework and methodology. The Mochi core components and microservices are described. Examples of the application of the Mochi methodology to the development of four specialized services are detailed. Finally, a performance evaluation of a Mochi core component, a Mochi microservice, and a composed service providing an object model is performed. The paper concludes by positioning Mochi relative to related work in the HPC space and indicating directions for future work.

11.
The Globus project: a status report   Cited by: 8 (self-citations: 0, citations by others: 8)
The Globus project is a multi-institutional research effort that seeks to enable the construction of computational grids providing pervasive, dependable, and consistent access to high-performance computational resources, despite geographical distribution of both resources and users. Computational grid technology is being viewed as a critical element of future high-performance computing environments that will enable entirely new classes of computation-oriented applications, much as the World Wide Web fostered the development of new classes of information-oriented applications. In this paper, we report on the status of the Globus project as of early 1998. We describe the progress that has been achieved to date in the development of the Globus toolkit, a set of core services for constructing grid tools and applications. We also discuss the Globus Ubiquitous Supercomputing Testbed Organization (GUSTO) that we have constructed to enable large-scale evaluation of Globus technologies, and we review early experiences with the development of large-scale grid applications on the GUSTO testbed.

12.
The advent of service-oriented architectures in Grid environments has fostered the development of applications in distributed deployments. The Globus Toolkit 4 (GT4), with its implementation of stateful Web services via the WS-Resource Framework (WSRF), is a suitable platform for developing these Grid services. Its increased usage in many scientific areas reveals new scenarios where fault tolerance and high availability should be considered. This paper describes a library that manages the automatic replication of WSRF-based Grid services. This functionality can be plugged into existing Grid services, by means of minimal changes in their source code, to achieve state replication through WS-Resources. The architecture of the library and its performance evaluation are described. In particular, two different replica topologies are addressed: ring-based and leaf-to-root complete binary tree, in order to achieve resource state update in logarithmic time with respect to the number of replicas. Finally, the paper describes the integration of the replication library into a service-oriented metascheduler to enhance fault tolerance and to guarantee service availability.

13.
Emerging Web-based applications require distributed multimedia information system (DMIS) infrastructures. Examples of such applications abound in the domains of medicine, entertainment, manufacturing, e-commerce, as well as military and critical national infrastructures. Development of DMIS for such applications needs a broad range of technological solutions for organizing, storing, and delivering multimedia information in an integrated, secure and timely manner with guaranteed end-to-end (E2E) quality of presentation (QoP). DMIS are viewed as catalysts for new research in many areas, ranging from basic research to applied technology. This view is a result of the fact that no single monolithic end-to-end architecture for DMIS can meet the wide spectrum of characteristics and requirements of various Web-based multimedia applications. One size does not fit all in this medium of communication. Management of integrated end-to-end QoP and ensuring information security in DMIS, when viewed in conjunction with real-world constraints and system-wide performance requirements, present formidable research and implementation challenges. These challenges encompass all the sub-system components of a DMIS. The ultimate objective of achieving comprehensive end-to-end QoP management relies on the performance and allocation of resources of each of the DMIS sub-system components, including networks, databases, and end-systems. In this paper, we elaborate on these challenges and present a high-level distributed architecture aimed at providing the critical functionality for a DMIS.
Arif Ghafoor

14.
When parallel applications are run in large-scale distributed environments, such as grids, peer-to-peer (P2P) systems, and clouds, the set of resources used can change dynamically as machines crash, reservations end, and new resources become available. It is vital for applications to respond to these changes. Therefore, it is necessary to keep track of the available resources, a problem which is known to be notoriously difficult. In this article we argue that resource tracking must be provided as standard functionality in the lower parts of the software stack. We propose a general solution to resource tracking: the Join–Elect–Leave (JEL) model. JEL provides unified resource tracking for parallel and distributed applications across environments. JEL is a simple yet powerful model based on notifying when resources have Joined or Left the computation. We demonstrate that JEL is suitable for resource tracking in a wide variety of programming models, ranging from the fixed resource sets traditionally used in MPI-1 to flexible grid-oriented programming models. We compare several JEL implementations, and show these to perform and scale well in several real-world scenarios involving grids, clouds and P2P systems applied concurrently, and wide-area systems with failing resources. Using JEL, we have won first prize in a number of international distributed computing competitions. Copyright © 2010 John Wiley & Sons, Ltd.

15.
Most health-related issues, such as public health outbreaks and epidemiological threats, are better understood from a spatial–temporal perspective and clearly demand related geospatial datasets and services so that decision makers may jointly make informed decisions and coordinate response plans. Although current health applications support some geospatial features, these are still disconnected from the wide range of geospatial services and datasets that geospatial information infrastructures may bring into health. In this paper we examine whether geospatial information infrastructures, in terms of standards-based geospatial services, technologies, and data models as operational assets already in place, can be exploited by health applications for which the geospatial dimension is of great importance. This may be addressed by defining better collaboration strategies to uncover and promote geospatial assets to the health community. We discuss the value of collaboration, as well as the opportunities that geographic information infrastructures offer to address geospatial challenges in health applications.

16.
Globus Nexus is a professionally hosted Platform-as-a-Service that provides identity, profile and group management functionality for the research community. Many collaborative e-Science applications need to manage large numbers of user identities, profiles, and groups. However, developing and maintaining such capabilities is often challenging given the complexity of modern security protocols and requirements for scalable, robust, and highly available implementations. By outsourcing this functionality to Globus Nexus, developers can leverage best-practice implementations without incurring development and operations overhead. Users benefit from enhanced capabilities such as identity federation, flexible profile management, and user-oriented group management. In this paper we present Globus Nexus, describe its capabilities and architecture, summarize how several e-Science applications leverage these capabilities, and present results that characterize its scalability, reliability, and availability.

17.
Scientific communities have recently been producing a growing number of computation-intensive applications, which calls for the interoperation of distributed infrastructures including Clouds, Grids and private clusters. The European SHIWA and ER-flow projects have enabled the combination of heterogeneous scientific workflows, and their execution in a large-scale system consisting of multiple Distributed Computing Infrastructures. One of the resource management challenges of these projects is called parameter study job scheduling. A parameter study job of a workflow generally has a large number of input files to be consumed by independent job instances. In this paper we propose a meta-brokering framework for science gateways to support the execution of such workflows. In order to cope with the high uncertainty and unpredictable load of the utilized distributed infrastructures, we introduce the so-called resource priority services. These tools are capable of determining and dynamically updating the priorities of the available infrastructures to be selected for job instances. Our evaluations show that this approach leads to an efficient distribution of job instances among the available computing resources, resulting in a shorter makespan for parameter study workflows.

18.
During the last decade, the number of distributed application domains with temporal requirements has grown significantly, creating the need to explore new concepts and paradigms that allow, on the one hand, the development of dynamic and flexible distributed applications and, on the other hand, the reusability of code. Service-oriented paradigms have been successfully applied to distributed environments, increasing their flexibility and allowing the reusability of their components. In addition, distributed real-time Java technologies have proven to be good candidates for deploying real-time distributed applications. This paper presents a model for service-oriented applications on a time-triggered distributed real-time Java environment, focusing on the definition of the temporal model of an application and its schedulability, and applying and evaluating this model in real-time service-oriented composition algorithms. Copyright © 2012 John Wiley & Sons, Ltd.

19.
Scheduling constitutes an integral feature of Grid computing infrastructures, and is also key to realizing several of the Grid promises. In particular, scheduling can maximize the resources available to end users and accelerate the execution of jobs, while also supporting scalable and autonomic management of the resources comprising a Grid. Grid scheduling functionality hinges on middleware components called meta-schedulers, which automatically distribute jobs across the dispersed heterogeneous resources of a Grid. In this paper we present the design and implementation of a Grid meta-scheduler, which we call EMPEROR. EMPEROR provides a framework for implementing scheduling algorithms based on performance criteria. In implementing a particular instantiation of this framework, we have devised models for predicting host load and memory resources, and accordingly for estimating the running time of a task. These models hinge on time series analysis techniques and take into account results of the cluster computing literature. Apart from incorporating these models, EMPEROR provides fully fledged Grid scheduling functionality, which complies with OGSA standards as the latter are reflected in the Globus toolkit. Specifically, EMPEROR interfaces to Globus middleware services (i.e., GSI, MDS, GRAM) for discovering resources, implementing the scheduling algorithm and ultimately submitting jobs to local scheduling systems. By and large, EMPEROR is one of the few standards-based meta-schedulers making use of dynamic scheduling information.
