期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

张继超常迪郑纬民沈美明《小型微型计算机系统》2004,25(1):30-34

用户态通信允许应用程序从应用层直接访问网络接口，主机与网络接口之间的数据传输模式对协议性能有重要影响．有效的数据传输模式可以减少数据拷贝次数，降低数据传输开销，尽可能将网络硬件的高性能反映到用户层．本文详细分析了通信系统开销来源，讨论了Myrinet网络环境下不同数据传输模式的实现与特点，测试并分析了不同数据传输模式对用户态通信性能的影响，并给出了相应的适用环境．相似文献

2.

基于Myrinet的用户空间精简协议 总被引：5，自引：0，他引：5

董春雷郑纬民《软件学报》1999,10(3):299-303

通信子系统是影响工作站机群系统整体性能的主要因素.文章在分析和比较了3种常用的网络性能之后,指出上层协议的处理是影响工作站机群系统性能的主要瓶颈.在由640Mbps的Myrinet连接的8台Sun SPARC工作站组成的机群系统上实现了一个用户层的高性能的精简通信协议——RCP(reduced communication protocol).通过精简协议的冗余功能、减少数据拷贝次数和直接操作硬件缓冲区等方法,达到低延迟、高效率.RCP的回路延迟时间比TCP/IP小得多（200μs vs 1 540μs）, 相似文献

3.

Exploiting NIC architectural support for enhancing IP-based protocols on high-performance networks

《Journal of Parallel and Distributed Computing》2005,65(11):1348-1365

While a number of user-level protocols have been developed to reduce the gap between the performance capabilities of the physical network and the performance actually available, their compatibility issues with the existing sockets-based applications and IP-based infrastructure has been an area of major concern. To address these compatibility issues while maintaining a high performance, a number of researchers have been looking at alternative approaches to optimize the existing traditional protocol stacks. Broadly, previous research has broken up the overheads in the traditional protocol stack into four related aspects, namely: (i) compute requirements and contention, (ii) memory contention, (iii) I/O bus contention and (iv) system resources’ idle time. While previous research dealing with some of these aspects exists, to the best of our knowledge, there is no work which deals with all these issues in an integrated manner while maintaining backward compatibility with existing applications and infrastructure. In this paper, we address each of these issues, propose solutions for minimizing these overheads by exploiting the emerging architectural features provided by modern Network Interface Cards (NICs) and demonstrate the capabilities of these solutions using an implementation based on UDP/IP over Myrinet. Our experimental results show that with our implementation of UDP, termed as E-UDP, can achieve up to 94% of the theoretical maximum bandwidth. We also present a mathematical performance model which allows us to study the scalability of our approach for different system architectures and network speeds. 相似文献

4.

Evolution of the virtual interface architecture

von Eicken T. Vogels W. 《Computer》1998,31(11):61-68

To provide a faster path between applications and the network, researchers have advocated removing the operating system kernel and its centralized networking stack from the critical path and creating a user level network interface. With these interfaces, designers can tailor the communication layers each process uses to the demands of that process. Consequently, applications can send and receive network packets without operating system intervention, which greatly decreases communication latency and increases network throughput. Unfortunately, the diversity of approaches and lack of consensus has stalled progress in refining research results into products-a prerequisite to the widespread adoption of these interfaces. Recently, however, Intel, Microsoft, and Compaq have introduced the Virtual Interface Architecture, an emerging standard for cluster or system area networks. Products based on the VIA have already surfaced, notably GigaNet's GNN1000 network interface. As more products appear, research into application level issues can proceed and the technology of user level network interfaces should mature. Several prototypes-among them Cornell University's U-Net2-have heavily influenced the VIA. We describe the architectural issues and design trade-offs at the core of these prototype designs 相似文献

5.

一种高效的用户级通信协议的研究与实现

李斌辛海红胡铭曾《计算机工程》2006,32(1):148-150

介绍了一种基于零拷贝思想的用户级通信协议的设计与实现。通过对传统操作系统在处理网络数据包的过程中多次拷贝而造成的延迟进行了仔细分析,设计了一种内存映射机制,使用户的应用程序避开了操作系统核心的干预,直接与网络接口进行交互,并有效地完成操作系统核心与用户之间的数据交换,从而地减少了网络通信的开销与延迟。相似文献

6.

Client-server computing on Shrimp

Damianakis S.N. Biles A. Dubnicki C. Felten E.W. 《Micro, IEEE》1997,17(1):8-18

Technological advances in network and processor speeds do not lead to equally large improvements in the performance of client-server systems. For instance, hardware performance improvements do not translate into faster user applications. This is primarily because software overhead dominates communication. The Shrimp project at Princeton University seeks solutions to this problem. Shrimp (Scalable High-Performance Really Inexpensive Multiprocessor) supports protected user-level communication between processes by mapping memory pages between virtual address spaces. This virtual memory-mapped network interface has several advantages, including flexible user-level communication and very low overhead for initiating data transfers. Here, we examine two remote procedure call (RPC) protocols and one socket implementation for Shrimp that deliver almost undiminished hardware performance to user applications 相似文献

7.

Design and implementation of efficient communication abstractions on the Virtual Interface Architecture: Stream sockets and RPC experience

Hemal V. Shah Rajesh S. Madukkarumukumana 《Software》2001,31(11):1043-1065

The emergence and standardization of system area networks (SANs) has provided distributed applications with a medium for high‐bandwidth, low‐latency communication. Standard user‐level networking architecture such as the Virtual Interface (VI) Architecture enables distributed applications to perform low overhead communication over SANs. The VI Architecture significantly reduces system processing overheads and provides each consumer process with a protected, directly accessible interface to the network hardware. Developing distributed applications using low‐level primitives provided by user‐level networking architecture like the VI Architecture is complex and requires significant effort. This paper describes how high‐level communication paradigms like stream sockets and remote procedure call (RPC) can be efficiently built over the VI Architecture. To evaluate performance benefits for standard client–server and multi‐threaded environments, our focus is on off‐the‐shelf sockets and RPC interfaces and commercially available VI Architecture‐based SANs. The key design techniques developed in this paper include credit‐based flow control, decentralized user‐level protocol processing, caching of pinned communication buffers, and deferred processing of completed send operations. In the experimental evaluation, the one‐way bandwidth achieved by stream sockets over VI Architecture was three to four times better than the bandwidth achieved by running legacy protocols over the same interconnect. On the same SAN, high‐performance stream sockets and RPC over VI Architecture achieve significantly better (between 2× and 3× less) latency than conventional stream sockets and RPC over standard networking protocols in a Windows NT? 4.0 environment. Furthermore, our high‐performance RPC transparently improved the network performance of the distributed component object model (DCOM) by a factor of two to three. Copyright © 2001 John Wiley & Sons, Ltd. 相似文献

8.

Software Distributed Shared Memory: a VIA‐based implementation and comparison of sequential consistency with home‐based lazy release consistency

Vadim Iosevich Assaf Schuster 《Software》2005,35(8):755-786

A Distributed Shared Memory (DSM) system provides a distributed application with a shared virtual address space. This article proposes a design for implementing the DSM communication layer on top of the Virtual Interface Architecture (VIA), an industry standard for user‐level networking protocols on high‐speed clusters. User‐level communication protocols operate in user mode, thus removing the operating system kernel's overhead from the critical communication pass, and significantly diminishing communication overhead as a result. We analyze VIA's facilities and limitations in order to ascertain which implementation trade‐offs can be best applied to our development of an efficient communication substrate optimized for DSM requirements. We then implement a multithreaded version of the Home‐based Lazy Release Consistency (HLRC) protocol on top of this substrate. In addition, we compare the performance of this HLRC protocol with that of the Sequential Consistency (SC) protocol in which a Multi View (MV) memory mapping technique was used. This technique enables a fine‐grained access to shared memory, while still relying on the virtual memory hardware to track memory accesses. We perform an ‘apple‐to‐apple’ comparison on the same testbed environment and benchmark suite, and investigate the effectiveness and scalability of both protocols. Copyright © 2005 John Wiley & Sons, Ltd. 相似文献

9.

Scheduling communication in multithreaded programs: experimental results

Juan Carlos Gomez Vernon Rego V. S. Sunderam 《Concurrency and Computation》2006,18(1):1-28

When the critical path of a communication session between end points includes the actions of operating system kernels, there are attendant overheads. Along with other factors, such as functionality and flexibility, such overheads motivate and favor the implementation of communication protocols in user space. When implemented with threads, such protocols may hold the key to optimal communication performance and functionality. Based on implementations of reliable user‐space protocols supported by a threads framework, we focus on our experiences with internal threads' scheduling techniques and their potential impact on performance. We present scheduling strategies that enable threads to do both application‐level and communication‐related processing. With experiments performed on a Sun SPARC‐5 LAN environment, we show how different scheduling strategies yield different levels of application‐processing efficiency, communication latency and packet‐loss. This work forms part of a larger study on the implementation of multiple thread‐based protocols in a single address space, and the benefits of coupling protocols with applications. Copyright © 2005 John Wiley & Sons, Ltd. 相似文献

10.

TCP／IP协议对快速网络的作用分析

曹秉超张媛都志辉《计算机科学》2003,30(5):118-121

1 引文 Myrinet和Gigabit-Ethernet是当今世界上性能最好的两种可以运用于局域网的高性能并行计算的网络系统。它们以其高效、高速的传输特点在广泛的领域里得到了好评。但是,这两种同样是高端的网络产品,其实现的技术在很多方面却是大相径庭的。而在当今的网络协议中,TCP/IP协议是当然的主角。无论是万维网(WWW),还是一般的局域网(LAN),我们都曾经,或者正在使用TCP/IP协议。对于TCP/IP协议,Myrinet和Gigabit-Ethernet有着不同选择。虽然对于Myrinet来说,用户能够直接挂接TCP/IP 相似文献

11.

Myrinet communication

Dubnicki C. Bilas A. Yuqun Chen Damianakis S.N. Kai Li 《Micro, IEEE》1998,18(1):50-52

In last year's IEEE Micro special issue on the Hot Interconnects IV Symposium, we discussed our experiences with client-server computing on the Paragon-based Shrimp multicomputer. Since then we implemented the same virtual memory-mapped communication (VMMC) mechanism on the Myrinet-based Shrimp multicomputer (a set of Pentium PCs connected by a Myrinet network). In both cases we achieved protected, user-level, end-to-end performance close to the hardware limits. However, VMMC imposes a copy for high-level connection-oriented communication libraries. Therefore, we extended the model, and designed and built a new implementation. This update reports our latest work with VMMC on the Myrinet-based Shrimp multicomputer, which we call VMMC-2 相似文献

12.

统一缓存:基于用户层通信的合作缓存技术

张悠慧汪东升郑纬民《计算机研究与发展》2003,40(7):1117-1123

合作缓存机制是集群系统提高整体性能的一种有效方法，其利用高速网络将各个结点的缓存进行合作管理与访问，大幅度提高了缓存的命中率．但传统的合作缓存技术没有考虑到广为应用的高效用户层通信机制的特点．提出一种新的用户层通信与合作缓存技术相融合的缓存机制——集群统一缓存．这一机制充分利用了用户层通信的特点，包括协议精简、零拷贝、虚拟内存映射通信(VMMC)技术等，将缓存与结点通信相融合，减少了集群系统应用程序IO模块的层次与复杂度，提高了系统性能．同时这一机制也顺应了IO子系统日益独立化的发展趋势．该技术已经应用于自行开发的面向对象的Internet服务存储平台——TODS上，具有高效、扩展性好与软件设计简单等特点．相似文献

13.

Firmware-Level Latency Analysis on a Gigabit Network

Jin Hyun-Wook Yoo Chuck Choi Jin-Young 《The Journal of supercomputing》2003,26(1):59-75

Gigabit networks are equipped with increasingly intelligent network interface cards, and the firmware running in the cards does various tasks related to end-to-end communication. For an accurate performance evaluation of gigabit networks, it is very important to characterize and quantify the firmware. However, the firmware has been neglected in the latency analyzes of network protocols.This paper presents an in-depth latency analysis of Myrinet. Our findings include that the major bottleneck is the network interface card itself. This is true especially for so-called lightweight user-level protocols (such as BPI of Myrinet) designed for high-speed communication. Although BPI is very lean and efficient in the host, its sending throughput becomes similar to UDP. This result is very unexpected and surprising. Through firmware-level measurements, we identify that the cause of bottleneck is the DMA performance. 相似文献

14.

底层通信协议中内存映射机制的设计与实现^* 总被引：4，自引：1，他引：3

刘炜郑纬民申俊鞠大鹏《软件学报》1999,10(1):24-28

在底层网络通信协议中使用内存映射机制为用户层应用提供了虚拟网络界面,使用户层能够方便地访问快速通信设备;通过减少系统软件的协议处理开销,有效地减少了网络通信的延迟.讨论了通信协议中的内存映射机制的设计思想和实现过程,提出了通信区的概念,利用通信区有效地完成核心与用户之间的数据交换.同时给出一个实例,对其实现与性能进行了分析. 相似文献

15.

系统域网中用户级通信库的实现

尹宏达史岗胡明昌《计算机工程》2005,31(11):190-192

在系统域网环境中,网络硬件具备非常优良的性能,然而传统的通信库存在大量不必的要软件开销,大幅度地降低了通信性能。通过允许用户进程直接访问网络设备并减少收发过程中的内存拷贝,可以避免由操作系统带来的开销,从而实现用户级通信,降低延迟并提高带宽。经过对用户级通信库的性能分析,可以发现用户级通信库具有更好的性能。相似文献

16.

集群高性能通信系统综述 总被引：1，自引：0，他引：1

杨丰都志辉刘志强《计算机科学》2007,34(5):240-242

本文分析对比了集群系统中几种主要的高性能通信系统： Myrinet, Infiniband和 Quadricso,在硬件方面对比了链路,交换设备与主机适配器,指出各自的组成与特点,在软件方面则描述了各自的软件实现：Myrinet/GM, Infiniband/VAPI 以及 Quadrics/Elanlib的主要特点和功能,最后给出各个系统的实际性能,结果表明：Infiniband系统性能高,结构简单,对应的体系结构具有良好的前景. 相似文献

17.

网络层匿名通信协议综述

下载免费PDF全文

王良民倪晓铃赵蕙《网络与信息安全学报》2020,6(1):11-26

匿名通信系统是一种建立在应用层之上结合利用数据转发、内容加密、流量混淆等多种隐私保护技术来隐藏通信实体关系和内容的覆盖网络。然而,作为覆盖网络运行的匿名通信系统,在性能和安全保障上的平衡问题上存在不足。未来互联网架构的出现使构建基于基础设施的匿名通信系统成为可能。此类匿名通信系统将匿名设计为网络基础设施服务,通过为路由器配备加密操作,可解决匿名网络的可拓展性和性能限制的部分问题,因此也可称它们为网络层匿名通信协议。对现有的网络层匿名通信协议（LAP、Dovetail、Hornet、PHI和Taranet）进行了研究,介绍了网络层匿名通信协议的分类标准,简述其创新点和具体加密思想,并对它们如何在安全性和性能二者之间的权衡进行分析,也指出了这几种网络匿名通信协议的优势和不足,最后提出在匿名通信系统发展的过程中所面临的挑战和需要深入研究的问题。相似文献

18.

曙光超级服务器上DMPI和DPVM的设计

赵毅王哲马捷《计算机工程与应用》2003,39(25):127-131

介绍了运行于曙光超级服务器上的DMPI和DPVM的设计与实现。DMPI支持RMA和动态进程管理的功能。DMPI和DPVM在对ADI层进行灵活有效的扩展(DADI-E)的基础上提供了一个高性能的通信层。该文对DADI-E的特点进行了阐述,包括RMA机制、动态进程管理和对PVM的支持。而且,DMPI还提供了一个特殊的数据通路———流水发送,它可以改善DMPI的带宽,使之接近于底层通信协议的带宽峰值。最后还给出了DMPI和DPVM在Myinet卡上的性能指标。相似文献

19.

天河-1A互连系统的接口设计

刘路张磊谢旻王永庆《计算机工程与科学》2013,35(2):18-25

NIC是高性能互连网络THNet的网络接口芯片,基于自主研发的通信协议,它高效地实现了无连接、零拷贝、用户级通信的RDMA传输机制,基于该机制的MPI实现具有极高的系统可扩展性。实现了基于控制报文触发的描述符队列处理机制,以支持卸载的聚合通信,包括广播和栅栏同步。使用NIC芯片的网络接口卡在测试中获得了1.57μs的最小单边延迟和6.34GB/s的带宽。NIC已成功应用于2010年TOP500排名世界第一的天河-1A超级计算机。相似文献

20.

无线传感器网络中的地址分配协议 总被引：1，自引：0，他引：1

杜治高钱德沛刘轶《软件学报》2009,20(10):2787-2798

地址用来标识节点,使能网络通信协议,在无线传感器网络中扮演重要角色.由于无线传感器网络中节点众多,再考虑其网络动态性,手动地为每个节点分配地址是一件繁琐甚至无法完成的工作,于是,地址分配协议成为必需.由于无线传感器网络自身所具有的特点,传统的DHCP协议和ad hoc网络的地址分配协议对其不再适用.分析了无线传感器网络地址分配协议的必要性,总结了地址分配协议需要解决的问题和所面临的挑战,并对已有地址分配协议进行了分类.介绍和比较了当前有代表性的地址分配协议,并指出了它们的问题,分析了地址分配协议下一步研究的重点. 相似文献