Similar literature
 20 similar documents found (search time: 15 ms)
1.
This paper describes the design and implementation of a kernel for the distributed programming language StarMod. The distributed programming kernel was written in a subset of StarMod supported by a concurrent programming kernel. Kernel issues addressed include process representation, I/O device management, signal semantics, system utilities, network communication and the implementation of high-level language communication primitives. We conclude with a summary of our experiences in the development of a ‘bare machine’ kernel for a network of microprocessors.

2.
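Design and Implementation of a Memory-Mapping Mechanism in Low-Level Communication Protocols   (cited 4 times in total: 1 self-citation, 3 by others)
Using a memory-mapping mechanism in a low-level network communication protocol gives user-level applications a virtual network interface, so that user-level code can conveniently access fast communication devices. By reducing the protocol-processing overhead of system software, it effectively reduces network communication latency. This paper discusses the design ideas and the implementation of the memory-mapping mechanism in the communication protocol and introduces the concept of a communication area, which is used to exchange data efficiently between the kernel and the user level. An example is given, and its implementation and performance are analyzed.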
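
As a rough illustration of the communication-area idea in the abstract above, the following Python sketch exchanges one message through a single memory-mapped region instead of passing it through a chain of intermediate copies. The class name `CommArea`, the 4-byte length header, and the single-slot layout are assumptions made for this sketch; the abstract does not describe the actual layout.

```python
import mmap
import struct


class CommArea:
    """Hypothetical 'communication area': one memory-mapped region shared by
    a producer (standing in for the kernel) and a consumer (the user level)."""

    HEADER = struct.Struct("<I")  # 4-byte little-endian payload length

    def __init__(self, size=4096):
        # Anonymous mapping; a real system would map device or kernel memory.
        self.buf = mmap.mmap(-1, size)
        self.size = size

    def put(self, payload: bytes) -> None:
        if self.HEADER.size + len(payload) > self.size:
            raise ValueError("payload does not fit in the communication area")
        start = self.HEADER.size
        # Write the payload first, then the length header that publishes it.
        self.buf[start:start + len(payload)] = payload
        self.HEADER.pack_into(self.buf, 0, len(payload))

    def get(self) -> bytes:
        (length,) = self.HEADER.unpack_from(self.buf, 0)
        start = self.HEADER.size
        data = bytes(self.buf[start:start + length])
        self.HEADER.pack_into(self.buf, 0, 0)  # mark the area empty again
        return data
```

Calling `put(b"hello")` followed by `get()` on the same `CommArea` returns the same bytes. In a real kernel/user setting the two sides would be different protection domains mapping the same physical pages, and the header update would need proper synchronization.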

3.
A Distributed Shared Memory (DSM) system provides a distributed application with a shared virtual address space. This article proposes a design for implementing the DSM communication layer on top of the Virtual Interface Architecture (VIA), an industry standard for user‐level networking protocols on high‐speed clusters. User‐level communication protocols operate in user mode, thus removing the operating system kernel's overhead from the critical communication path, and significantly diminishing communication overhead as a result. We analyze VIA's facilities and limitations in order to ascertain which implementation trade‐offs can be best applied to our development of an efficient communication substrate optimized for DSM requirements. We then implement a multithreaded version of the Home‐based Lazy Release Consistency (HLRC) protocol on top of this substrate. In addition, we compare the performance of this HLRC protocol with that of the Sequential Consistency (SC) protocol in which a Multi View (MV) memory mapping technique was used. This technique enables a fine‐grained access to shared memory, while still relying on the virtual memory hardware to track memory accesses. We perform an ‘apple‐to‐apple’ comparison on the same testbed environment and benchmark suite, and investigate the effectiveness and scalability of both protocols. Copyright © 2005 John Wiley & Sons, Ltd.

4.
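Poor communication performance is one of the main factors limiting the performance of software distributed shared memory systems. User-level communication can fully exploit the hardware performance of high-speed networks, reduce the number of data copies, lower software overhead, and markedly improve bandwidth and latency, which opens a new path to improving the performance of software distributed shared memory systems. We designed and implemented a user-level communication library for software distributed shared memory systems. It improves not only the communication performance of the system but also its parallel computing performance, and thus significantly improves the overall performance of the software distributed shared memory system.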

5.
High-level parallel programming models supporting dynamic fine-grained threads in a global object space are becoming increasingly popular for expressing irregular applications based on sophisticated adaptive algorithms and pointer-based data structures. However, implementing these multithreaded computations on scalable parallel machines poses significant challenges, particularly with respect to object caching. Object caching techniques must be able to tolerate unresponsive processors and protocol handler occupancy delays. This paper examines whether these challenges can be offset by leveraging responsive general-purpose communication architectural features (such as remote memory access and atomic operations), possibly compensating for the lack of more sophisticated hardware primitives by relying upon increased involvement of the run-time system and the compiler. A detailed performance analysis of four irregular applications, using the Illinois Concert System on the Cray T3D and the SGI Origin 2000, finds that existing software distributed shared memory (DSM) systems are capable of delivering good performance only in the presence of a high level of responsive communication architecture support (specifically, support for remote atomic operations). Recognizing that this situation stems from the synchronous request–reply nature of DSM protocols, we present a composable object caching framework, called view caching, which exploits knowledge of application data access semantics to construct custom protocols that require reduced processor synchronization. View caching protocols are more tolerant to responsiveness and occupancy delays and are able to exploit even lower level responsive communication primitives (such as nonatomic remote memory accesses) for a performance benefit.

6.
Traditional software Distributed Shared Memory (DSM) systems rely on the virtual memory management mechanisms to detect accesses to shared memory locations and maintain their consistency. The resulting involvement of the OS kernel, and the significant overhead associated with it, can be avoided by careful compile time analysis and code instrumentation. In this paper, we propose such a Compiler Assisted Software support approach (CAS-DSM). In the CAS-DSM implementation, the involvement of the OS kernel is avoided by instrumenting the application code at the source level. The overhead caused by the execution of the instrumented code is reduced through several aggressive compile time optimizations. Finally, we also address the issue of reducing certain overheads in polling-based implementation of receiving asynchronous messages. We used SUIF, a public domain compiler tool, to implement compile time analysis, instrumentation and optimizations. We modified CVM, a publicly available software DSM, to support the instrumentation inserted by the compiler. Detailed performance evaluation of CAS-DSM is reported using a set of Splash/Splash2 parallel application benchmarks on a distributed memory IBM SP-2 machine. CAS-DSM achieved moderate to good performance improvements for most of the applications compared to the original CVM implementation. Reducing the overheads in polling-based implementation improves the performance of CAS-DSM significantly, resulting in an overall improvement of 12–52% over the original CVM implementation.

7.
A retrospective view is presented of the Charlotte distributed operating system, a testbed for developing techniques and tools to solve computation-intensive problems with large-grain parallelism. The final version of Charlotte runs on the Crystal multicomputer, a collection of VAX-11/750 computers connected by a local area network. The kernel/process interface is unique in its support for symmetric, bidirectional communication paths (called links), and synchronous nonblocking communications. Several lessons were learned in implementing Charlotte. Links have proven to be a useful abstraction, but the primitives do not seem to be at quite the right level of abstraction. The implementation uses finite-state machines and a multitask kernel, both of which work well. It also maintains absolute distributed information, which is more expensive than using hints. The development of high-level tools, particularly the Lynx distributed programming language, has simplified the use of kernel primitives and helps to manage concurrency at the process level.

8.
9.
Lee, I.; King, R.B.; Paul, R.P. Computer, 1989, 22(6): 78-83
The authors present a real-time kernel developed to support a distributed multisensor system encountered in robotics applications. To ensure predictability, the kernel provides services with bounded worst-case execution times. In addition, the kernel allows the programmer to specify timing constraints for process execution and interprocess communication. The kernel uses these timing constraints both for scheduling processes and for scheduling communications. To illustrate the kernel, the authors describe a multisensor system being developed on their distributed real-time system. They present the measured performance of kernel primitives along with conclusions and remarks regarding distributed real-time systems.

10.
With the increasing proliferation of computer networks and distributed systems, there is a growing number of applications using multicast communication. This paper presents the Vartalaap system developed at IIT, Bombay. Vartalaap is a hierarchical distributed system for multicast communication over a network, implemented in a hardware-independent fashion. The multicast is achieved without resorting to unnecessary broadcasting of messages over the network. Issues covered in this paper include the primitives for multicast, the multicast model and the system architecture. We discuss the implementation of Vartalaap and compare it with some other systems. We conclude with a discussion on the limitations of the current implementation and directions for future work.

11.
This paper describes the design, implementation, and performance of ES-Kit, a distributed object-oriented system being developed by the Experimental Systems Project at the Microelectronics and Computer Technology Corporation. The operating system consists of a kernel and a set of Public Service Objects which dynamically extend the functionality of the kernel by providing several traditional operating system services when required by application objects. Applications for the ES-Kit environment are written in GNU C++ and do not require additional language primitives for distributed execution. Initial performance results from a representative set of applications indicate that the object-oriented paradigm provides a powerful solution to distributed programming.

12.
13.
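The Super-Object model proposes a new method for implementing language-level virtual shared memory on distributed-memory multicomputers in order to support the shared-memory communication paradigm. The model introduces a new concept, the super-object. Unlike other models, it builds on the super-object to propose a new way of locating shared data: the global address identifier (name, offset). Combining the Super-Object model with Fortran 77, we implemented a runtime system and library calls that let programmers write parallel programs in Fortran. Finally, the implementation of the system and the performance achieved are described.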
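
To make the (name, offset) addressing idea above concrete, here is a toy Python sketch of a directory that resolves a global address to an element of a named shared object. The class `SuperObjectDirectory`, its method names, and the notion of a recorded home node are hypothetical; the abstract only states that shared data is located by a (name, offset) pair.

```python
class SuperObjectDirectory:
    """Toy directory mapping a super-object name to its home node and storage."""

    def __init__(self):
        self._objects = {}  # name -> (home_node, list of elements)

    def declare(self, name, home_node, length):
        # Register a shared object of `length` elements owned by `home_node`.
        self._objects[name] = (home_node, [0.0] * length)

    def home_of(self, name):
        # A real runtime would use this to decide where to send a request.
        return self._objects[name][0]

    def read(self, address):
        name, offset = address  # global address identifier (name, offset)
        _, data = self._objects[name]
        return data[offset]

    def write(self, address, value):
        name, offset = address
        _, data = self._objects[name]
        data[offset] = value
```

A program would refer to an element as, for example, `("A", 3)` regardless of which node holds it. In this single-process sketch the lookup is a dictionary access, whereas a distributed runtime would turn it into a message to the home node.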

14.
The emergence and standardization of system area networks (SANs) has provided distributed applications with a medium for high‐bandwidth, low‐latency communication. Standard user‐level networking architecture such as the Virtual Interface (VI) Architecture enables distributed applications to perform low overhead communication over SANs. The VI Architecture significantly reduces system processing overheads and provides each consumer process with a protected, directly accessible interface to the network hardware. Developing distributed applications using low‐level primitives provided by user‐level networking architecture like the VI Architecture is complex and requires significant effort. This paper describes how high‐level communication paradigms like stream sockets and remote procedure call (RPC) can be efficiently built over the VI Architecture. To evaluate performance benefits for standard client–server and multi‐threaded environments, our focus is on off‐the‐shelf sockets and RPC interfaces and commercially available VI Architecture‐based SANs. The key design techniques developed in this paper include credit‐based flow control, decentralized user‐level protocol processing, caching of pinned communication buffers, and deferred processing of completed send operations. In the experimental evaluation, the one‐way bandwidth achieved by stream sockets over VI Architecture was three to four times better than the bandwidth achieved by running legacy protocols over the same interconnect. On the same SAN, high‐performance stream sockets and RPC over VI Architecture achieve significantly better (between 2× and 3× less) latency than conventional stream sockets and RPC over standard networking protocols in a Windows NT 4.0 environment. Furthermore, our high‐performance RPC transparently improved the network performance of the distributed component object model (DCOM) by a factor of two to three. Copyright © 2001 John Wiley & Sons, Ltd.
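
Credit-based flow control, one of the techniques named in the abstract above, can be sketched in a few lines of Python. This is a generic illustration of the technique, not the paper's implementation: the class `CreditedChannel` and the assumption that one credit equals one pre-posted receive buffer are choices made for this sketch.

```python
from collections import deque


class CreditedChannel:
    """Sender may post a message only while it holds a credit; each credit
    stands for one pre-posted receive buffer on the receiver side."""

    def __init__(self, receive_buffers: int):
        self.credits = receive_buffers  # credits currently held by the sender
        self.in_flight = deque()        # messages occupying receive buffers

    def send(self, message) -> bool:
        if self.credits == 0:
            return False                # no free buffer: caller must wait
        self.credits -= 1
        self.in_flight.append(message)
        return True

    def receive(self):
        message = self.in_flight.popleft()
        self.credits += 1               # consuming a buffer returns a credit
        return message
```

With two receive buffers, a third `send` is refused until the receiver consumes a message, so the sender can never overrun the buffers the receiver has posted. A real protocol would return credits in explicit or piggybacked messages rather than through shared state.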

15.
16.
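This paper presents the design and implementation of a user-level communication protocol based on the zero-copy idea. After carefully analyzing the latency caused by the multiple copies that a traditional operating system makes when processing network packets, we designed a memory-mapping mechanism that lets user applications bypass the operating system kernel and interact directly with the network interface, while exchanging data efficiently between the kernel and the user level, thereby reducing the overhead and latency of network communication.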
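
The difference between copying a packet and sharing its buffer, which is the core of the zero-copy idea above, can be shown with Python's built-in `memoryview`. This is only an in-process analogy; the buffer contents and variable names are made up for the example, and the paper's mechanism works at the level of kernel and network-interface memory.

```python
packet = bytearray(b"HDR!payload-bytes")  # stands in for a receive buffer

copied = bytes(packet[4:])      # slicing a bytearray copies the payload
view = memoryview(packet)[4:]   # a memoryview slice shares the buffer

packet[4:11] = b"PAYLOAD"       # the "device" overwrites the buffer in place

copied_text = copied.decode()        # still the old bytes: it was a copy
view_text = view.tobytes().decode()  # reflects the update: no copy was made
```

After the in-place update, `copied_text` still holds the original payload while `view_text` shows the new bytes, because the view never duplicated the data.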

17.
We describe the evolution of a distributed shared memory (DSM) system, Mirage, and the difficulties encountered when moving the system from a Unix-based kernel on the VAX to a Unix-based kernel on personal computers. Mirage provides a network transparent form of shared memory for a loosely coupled environment. The system hides network boundaries for processes that are accessing shared memory and is upward compatible with the Unix System V Interface Definition. This paper addresses the architectural dependencies in the design of the system and evaluates performance of the implementation. The new version, MIRAGE+, performs well compared to Mirage even though eight times the amount of data is sent on each page fault because of the larger page size used in the implementation. We show that performance of systems with a large ratio of page size to network packet size can be dramatically improved on conventional hardware by applying three well-known techniques: packet blasting, compression, and running at interrupt level. The measured time for a page fault in MIRAGE+ has been reduced 37 per cent by sending a page using packet blasting instead of using a handshake for each portion of the page. When compression was added to MIRAGE+, the time to fault a page across the network was further improved by 47 per cent when the page was compressed into one network packet. Our measured performance compares favorably with the amount of time it takes to fault a page from disk. Lastly, running at interrupt level may improve performance 16 per cent when faulting pages without compression.
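
The following Python sketch counts messages to show why packet blasting helps when a page spans several network packets. The acknowledgement scheme (one acknowledgement per fragment for the handshake, a single acknowledgement for a blasted page) and the sizes used in the example are assumptions for illustration; the abstract does not give the exact protocol.

```python
import math


def messages_per_page(page_size: int, packet_size: int, blasting: bool) -> int:
    """Count messages needed to move one page across the network."""
    fragments = math.ceil(page_size / packet_size)
    if blasting:
        return fragments + 1  # all fragments back to back, then one ack
    return 2 * fragments      # every fragment waits for its own ack
```

For a hypothetical 4096-byte page and 1024-byte packets this gives 8 messages with a per-fragment handshake and 5 with blasting. If compression fits the page into one packet, both variants need only 2 messages.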

18.
Distributed systems that consist of workstations connected by high performance interconnects offer computational power comparable to moderate size parallel machines. Middleware like distributed shared memory (DSM) or distributed shared objects (DSO) attempts to improve the programmability of such hardware by presenting to application programmers interfaces similar to those offered by shared memory machines. This paper presents the portable Indigo data sharing library which provides a small set of primitives with which arbitrary shared abstractions are easily and efficiently implemented across distributed hardware platforms. Sample shared abstractions implemented with Indigo include DSM as well as fragmented objects, where the object state is split across different machines and where interfragment communications may be customized to application-specific consistency needs. The Indigo library's design and implementation are evaluated on two different target platforms: a workstation cluster and an IBM SP2 machine. As part of this evaluation, a novel DSM system and consistency protocol are implemented and evaluated with several high performance applications. Application performance attained with the DSM system is compared to the performance experienced when utilizing the underlying basic message-passing facilities or when employing Indigo to construct customized fragmented objects implementing the application's shared state. Such experimentation results in insights concerning the efficient implementation of DSM systems (e.g. how to deal with false sharing). It also leads to the conclusion that Indigo provides a sufficiently rich set of abstractions for efficient implementation of the next generation of parallel programming models for high performance machines. © 1998 John Wiley & Sons, Ltd.

19.
This paper discusses a parallel Lisp system developed for a distributed memory, parallel processor, the Mayfly. The language has been adapted to the problems of distributed data by providing a tight coupling of control and data, including mechanisms for mutual exclusion and data sharing. The language was primarily designed to execute on the Mayfly, but also runs on networked workstations. Initially, we show the relevant parts of the language as seen by the user. Then we concentrate on the system Lisp level implementation of these constructs with particular attention to agents, a mechanism for limiting the cost of remote operations. Briefly mentioned are the low-level kernel hardware and software support of the system Lisp primitives. Work supported in part by the Hewlett-Packard Corporation.

20.
Visual Grid Workflow in Triana   (cited 1 time in total: 0 self-citations, 1 by others)
In this paper, we describe the graphical abstractions for Grids and services that have been implemented within the Triana problem solving environment. We provide an overview of the ways in which Triana interacts with services (e.g., Web and P2P services) and then how we interact with core Grid components, such as resource managers and data management systems through the extensive use of the GridLab GAT interface. We describe in detail the GAT philosophy and implementation and then show how the various GAT primitives can be represented in an intuitive fashion within a Triana workflow. This approach, which we refer to as the Visual GAT, differs substantially from other approaches because we do not tie our implementation to any specific underlying Grid middleware technologies; rather, we base our implementation on application level requirements and model such primitives from a user’s perspective by hiding as much complexity as possible without undermining the core capabilities required. We provide a use case to demonstrate the Visual GAT implementation and show how legacy applications can seamlessly be distributed and integrated in a dynamic fashion within complex data-driven workflow scenarios.
