期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Scheduling communication in multithreaded programs: experimental results

Juan Carlos Gomez Vernon Rego V. S. Sunderam 《Concurrency and Computation》2006,18(1):1-28

When the critical path of a communication session between end points includes the actions of operating system kernels, there are attendant overheads. Along with other factors, such as functionality and flexibility, such overheads motivate and favor the implementation of communication protocols in user space. When implemented with threads, such protocols may hold the key to optimal communication performance and functionality. Based on implementations of reliable user‐space protocols supported by a threads framework, we focus on our experiences with internal threads' scheduling techniques and their potential impact on performance. We present scheduling strategies that enable threads to do both application‐level and communication‐related processing. With experiments performed on a Sun SPARC‐5 LAN environment, we show how different scheduling strategies yield different levels of application‐processing efficiency, communication latency and packet‐loss. This work forms part of a larger study on the implementation of multiple thread‐based protocols in a single address space, and the benefits of coupling protocols with applications. Copyright © 2005 John Wiley & Sons, Ltd. 相似文献

2.

Virtual-memory-mapped network interfaces

Blumrich M.A. Dubnicki C. Felten E.W. Kai Li Mesarina M.R. 《Micro, IEEE》1995,15(1):21-28

In today's multicomputers, software overhead dominates the message-passing latency cost. We designed two multicomputer network interfaces that significantly reduce this overhead. Both support virtual-memory-mapped communication, allowing user processes to communicate without expensive buffer management and without making system calls across the protection boundary separating user processes from the operating system kernel. Here we compare the two interfaces and discuss the performance trade-offs between them 相似文献

3.

IN‐Tune: an In‐Situ non‐invasive performance tuning tool for multi‐threaded Linux on symmetric multiprocessing Pentium workstations

Jeremy B. Rodgers Rhonda Kay Gaede Jeffrey H. Kulick 《Software》1999,29(9):775-792

This paper documents the design and implementation of the IN‐Tune software tool suite, which enables a user to collect real‐time code and hardware profiling information on Intel‐based symmetric multiprocessors running the Linux operating system. IN‐Tune provides a virtually non‐invasive tool for performance analysis and tuning of programs. Unlike other analysis tools, IN‐Tune isolates data with respect to individual threads. It also utilizes performance monitoring hardware registers to permit instrumentation of individual threads as they run in‐situ, thus collecting data with appropriate considerations for a multiprocessor environment. Data can be sampled using two different mechanisms. First, the user can collect data by making calls to the system upon the occurrence of specific software events. Secondly, data can be collected at a fixed, fine grain (e.g. 1–10 microseconds) interval using either software or hardware interrupts. To allow observation of codes for which source code modification is impractical or impossible, a ‘shell’ task is created which permits monitoring without code modification. Although this work deals with Intel processors and Linux, the widespread availability of performance monitoring registers in modern processors makes this work widely applicable. Copyright © 1999 John Wiley & Sons, Ltd. 相似文献

4.

Software Distributed Shared Memory: a VIA‐based implementation and comparison of sequential consistency with home‐based lazy release consistency

Vadim Iosevich Assaf Schuster 《Software》2005,35(8):755-786

A Distributed Shared Memory (DSM) system provides a distributed application with a shared virtual address space. This article proposes a design for implementing the DSM communication layer on top of the Virtual Interface Architecture (VIA), an industry standard for user‐level networking protocols on high‐speed clusters. User‐level communication protocols operate in user mode, thus removing the operating system kernel's overhead from the critical communication pass, and significantly diminishing communication overhead as a result. We analyze VIA's facilities and limitations in order to ascertain which implementation trade‐offs can be best applied to our development of an efficient communication substrate optimized for DSM requirements. We then implement a multithreaded version of the Home‐based Lazy Release Consistency (HLRC) protocol on top of this substrate. In addition, we compare the performance of this HLRC protocol with that of the Sequential Consistency (SC) protocol in which a Multi View (MV) memory mapping technique was used. This technique enables a fine‐grained access to shared memory, while still relying on the virtual memory hardware to track memory accesses. We perform an ‘apple‐to‐apple’ comparison on the same testbed environment and benchmark suite, and investigate the effectiveness and scalability of both protocols. Copyright © 2005 John Wiley & Sons, Ltd. 相似文献

5.

一种改进的共享存储通信机制 总被引：1，自引：1，他引：0

王世铀毕硕本《小型微型计算机系统》1999,20(9):649-653

在操作系统的设计和实现中,并享存在是一种普遍采用的进程通信机制。目前的实现中,大多是将共享存储区放于用户空间中,并随着程序运行的需要而动态的创建、附接、断接和撤消共享区,并对其进行换入／换出等操作。这些操作引入了较多的额外开销,影响了进程通信的效率。另一方面,在目前的实现中,对共享区中中空间的管理和使用缺乏灵活性。本文提出了一种使用核心中固定区域作为共享存储区来实现进程间通信的方法,并对共享区中空相似文献

6.

LVT: a layered verification technique for distributed computing systems

Cui Zhang Brian R. Becker Dave Peticolas Ronald A. Olsson Karl N. Levitt 《Software Testing, Verification and Reliability》1999,9(2):107-133

This paper presents a layered verification technique, called LVT, for the verification of distributed computing systems with multiple component layers. Each lower layer in such a system provides services in support of functionality of the higher layer. By taking a very general view of programming languages as interfaces of systems, LVT treats each layer in a distributed computing system as a distributed programming language. Each relatively higher‐level language in the computing system is implemented in terms of a lower‐level language. The verification of each layer in a distributed computing system can then be viewed as the verification of implementation correctness for a distributed language. This paper also presents the application of LVT to the verification of a distributed computing system, which has three layers: a small high‐level distributed programming language; a multiple processor architecture consisting of an instruction set and system calls for inter‐process message passing; and a network interface. Programs in the high‐level language are implemented by a compiler mapping from the language layer to the multiprocessor layer. System calls are implemented by network services. LVT and its application demonstrate that the correct execution of a distributed program, most notably its inter‐process communication, is verifiable through layers. The verified layers guarantee the correctness of (1) the compiled code that makes reference to operating system calls, (2) the operating system calls in terms of network calls, and (3) the network calls in terms of network transmission steps. The specification and verification involved are carried out by using the Cambridge Higher Order Logic (HOL) theorem proving system. Copyright © 1999 John Wiley & Sons, Ltd. 相似文献

7.

User‐level operating system transactions

Diomidis Spinellis 《Software》2009,39(14):1215-1233

User‐level operating system transactions allow system administrators and ordinary users to perform a sequence of file operations and then commit them as a group, or abort them without leaving any trace behind. Such a facility can aid many system administration and software development tasks. The snapshot isolation concurrency control mechanism allows transactions to be implemented without locking individual system calls; conflicts are detected when the transaction is ready to commit. Along these lines we have implemented a user‐space transaction monitor that is based on ZFS snapshots and a file system event monitor. Transactions are committed through a robust and efficient algorithm that merges the operations performed on a file system's clone back to its parent. Both the performance impact and the implementation cost of the transaction monitor we describe are fairly small. Copyright © 2009 John Wiley & Sons, Ltd. 相似文献

8.

Performance analysis of wireless mobile networks with queueing priority and guard channels

Wei Feng Masashi Kowada 《International Transactions in Operational Research》2008,15(4):481-508

The blocking probability of originating calls and forced termination probability of handoff calls are important criteria in performance evaluation of integrated wireless mobile communication networks. In this paper, we model call traffic in a single cell in an integrated voice/data wireless mobile network by a finite buffer queueing system with queueing priority and guard channels. We categorize calls into three types of service classes: originating voice calls, handoff voice calls and data calls. The arrival streams of calls are mutually independent Poisson processes, and channel holding times are exponentially distributed with different means. We describe the behavior of the system by a three‐dimensional continuous‐time Markov process and present the explicit expression for steady‐state distribution of queue lengths using a recursive approach. Furthermore, we calculate the blocking, forced termination probabilities and derive the Laplace–Stieltjes transform of the stationary distribution of actual waiting times and their arbitrary kth moment. Finally, we give some numerical results and discuss the optimization problem for the number of guard channels. 相似文献

9.

The scalable process topology interface of MPI 2.2

Torsten Hoefler Rolf Rabenseifner Hubert Ritzdorf Bronis R. de Supinski Rajeev Thakur Jesper Larsson Trff 《Concurrency and Computation》2011,23(4):293-310

The Message‐passing Interface (MPI) standard provides basic means for adaptations of the mapping of MPI process ranks to processing elements to better match the communication characteristics of applications to the capabilities of the underlying systems. The MPI process topology mechanism enables the MPI implementation to rerank processes by creating a new communicator that reflects user‐supplied information about the application communication pattern. With the newly released MPI 2.2 version of the MPI standard, the process topology mechanism has been enhanced with new interfaces for scalable and informative user‐specification of communication patterns. Applications with relatively static communication patterns are encouraged to take advantage of the mechanism whenever convenient by specifying their communication pattern to the MPI library. Reference implementations of the new mechanism can be expected to be readily available (and come at essentially no cost), but non‐trivial implementations pose challenging problems for the MPI implementer. This paper is first and foremost addressed to application programmers wanting to use the new process topology interfaces. It explains the use and the motivation for the enhanced interfaces and the advantages gained even with a straightforward implementation. For the MPI implementer, the paper summarizes the main issues in the efficient implementation of the interface and explains the optimization problems that need to be (approximately) solved by a good MPI library. Copyright © 2010 John Wiley & Sons, Ltd. 相似文献

10.

一种高效的用户级通信协议的研究与实现

李斌辛海红胡铭曾《计算机工程》2006,32(1):148-150

介绍了一种基于零拷贝思想的用户级通信协议的设计与实现。通过对传统操作系统在处理网络数据包的过程中多次拷贝而造成的延迟进行了仔细分析,设计了一种内存映射机制,使用户的应用程序避开了操作系统核心的干预,直接与网络接口进行交互,并有效地完成操作系统核心与用户之间的数据交换,从而地减少了网络通信的开销与延迟。相似文献

11.

ipcmd: a command‐line interface to System V semaphores and message queues

Nathan T. Weeks Marina Kraeva Glenn R. Luecke 《Concurrency and Computation》2014,26(2):396-411

相似文献

12.

Design and implementation of a file system with on‐the‐fly data compression for GNU/Linux

Praveen B. Deepak Gupta Rajat Moona 《Software》1999,29(10):863-874

Data compression techniques have long been assisting in making effective use of disk, network and other resources. Most compression utilities require explicit user action for compressing and decompressing of file data. However, there are some systems in which compression and decompression of file data is done transparently by the operating system. A compressed file requires fewer sectors for storage on the disk. Hence, incorporating data compression techniques into a file system gives the advantage of a larger effective disk space. At the same time, the additional time needed for compression and decompression of file data gets compensated for to a large extent by the time gained because of fewer disk accesses. In this paper we describe the design and implementation of a file system for the Linux kernel, with the feature of on‐the‐fly data compression and decompression in a fashion that is transparent to the user. We also present some experimental results which show that the performance of our file system is comparable to that of Ext2fs, the native file system for Linux. Copyright © 1999 John Wiley & Sons, Ltd. 相似文献

13.

The design and implementation of the exported procedure call

H. Richard Kendall Vincent W. Freeh 《Software》2002,32(1):83-98

This paper describes the exported procedure call, a mechanism that pushes computation out of the operating system kernel and into user space. It supports a simple, secure model for system extensions. An exported procedure call incurs overhead crossing of the user‐kernel boundary, but once in user space, it has greater security and usability and is significantly simpler than an equivalent kernel operation. This paper demonstrates the capabilities of the exported procedure call by discussing two implementations. One is the Modify‐on‐access (Mona) file system and the other is the Magi device interface. Mona and Magi use the exported procedure call in order to safely execute untrusted or complex system extensions. This paper shows that in situations where raw kernel performance is not paramount, the exported procedure call is desirable. Copyright © 2001 John Wiley & Sons, Ltd. 相似文献

14.

Efficient multicast design in a microkernel environment

A. Averbuch A. Cohen 《Concurrency and Computation》1998,10(7):497-548

Multicast has become popular in recent years with the introduction of new, very fast networks. Existing solutions of the multicast design and implementation problems are either inefficient in microkernel environments or too expensive in terms of the host operating system overhead involved. In our search for a new solution we investigate various aspects of the problem. Exploring the desired semantics, we come to the conclusion that control functions, group maintenance algorithms and various ordering semantics can be implemented on top of the basic services, and that the efficiency of these implementations is less critical than that of the basic services. We describe a few naive solutions to supporting multicast in microkernels and show their limitations. Then we suggest a new solution to the problem and show analytically that it is significantly better than other solutions. The solution we decided to implement is a hybrid solution supporting control operations in user space and data operations in the kernel. It enables the semantics of multicast or group communication abstractions to be separated from the operating system support and mechanisms used to implement those abstractions. Since we propose an efficient solution while introducing minimal changes to the Mach kernel, our hybrid solution suggests splitting the group communication burden between the microkernel and a group server (GS). To implement our hybrid solution the Mach kernel has been modified. Finally, we present an implementation of our new solution for the Mach 3.0 microkernel environment and show a very significant measured speedup in performance of multicast operations over naive methods for multicasting. © 1998 John Wiley & Sons, Ltd. 相似文献

15.

Implementation issues in the development of the parsec parser

Mary P. Harper Randall A. Helzerman Carla B. Zoltowski Boon-Dock Yeo Yin Chan Todd Stewart Bryan L. Pellom 《Software》1995,25(8):831-862

This paper describes the implementation of a constraint-based parser, PARSEC (Parallel ARchitecture SEntence Constrainer), which has the required flexibility that a user may easily construct a custom grammar and test it. Once the user designs grammar parameters, constraints, and a lexicon, our system checks them for consistency and creates a parser for the grammar. The parser has an X-windows interface that allows a user to view the state of a parse of a sentence, test new constraints, and dump the constraint network to a file. The parser has an option to perform the computationally expensive constraint propagation steps on the MasPar MP-1. Stream and socket communication was used to interface the MasPar constraint parser with a standard X-windows interface on our Sun Sparcstation. The design of our heterogeneous parser has benefitted from the use of object-oriented techniques. Without these techniques, it would have been more difficult to combine the processing power of the MasPar with a Sun Sparcstation. Also, these techniques allowed the parser to gracefully evolve from a system that operated on single sentences, to one capable of processing word graphs containing multiple sentences, consistent with speech processing. This system should provide an important component of a real-time speech understanding system. 相似文献

16.

RTLinux中分布式实时IPC的设计与实现

章勤刘淑英《计算机工程与科学》2004,26(7):5-7

RTLinux是一个具有硬实时性能的单机实时操作系统，它提供了多种进程间通信机制，如信号量、消息队列和RT-fifo等。在RTLinux提供的进程间通信机制基础上，本文提出了一种在RTLinux中设计并实现分布式实时IPC模块的具体方法，最后详细阐述了分布式实时IPC模块的工作流程和各功能模块的实现。相似文献

17.

An event‐flow model of GUI‐based applications for testing

Atif M. Memon 《Software Testing, Verification and Reliability》2007,17(3):137-157

Graphical user interfaces (GUIs) are by far the most popular means used to interact with today's software. The functional correctness of a GUI is required to ensure the safety, robustness and usability of an entire software system. GUI testing techniques used in practice are resource intensive; model‐based automated techniques are rarely employed. A key reason for the reluctance in the adoption of model‐based solutions proposed by researchers is their limited applicability; moreover, the models are expensive to create. Over the past few years, the present author has been developing different models for various aspects of GUI testing. This paper consolidates all of the models into one scalable event‐flow model and outlines algorithms to semi‐automatically reverse‐engineer the model from an implementation. Earlier work on model‐based test‐case generation, test‐oracle creation, coverage evaluation, and regression testing is recast in terms of this model by defining event‐space exploration strategies (ESESs) and creating an end‐to‐end GUI testing process. Three such ESESs are described: for checking the event‐flow model, test‐case generation, and test‐oracle creation. Two demonstrational scenarios show the application of the model and the three ESESs for experimentation and application in GUI testing. Copyright © 2007 John Wiley & Sons, Ltd. 相似文献

18.

Communicating between the kernel and user‐space in Linux using Netlink sockets

Pablo Neira‐Ayuso Rafael M. Gasca Laurent Lefevre 《Software》2010,40(9):797-810

When developing Linux kernel features, it is a good practice to expose the necessary details to user‐space to enable extensibility. This allows the development of new features and sophisticated configurations from user‐space. Generally, software developers have to face the task of looking for a good way to communicate between the kernel and user‐space in Linux. This tutorial introduces you to Netlink sockets, a flexible and extensible messaging system that provides communication between kernel and user‐space. We provide the fundamental guidelines for practitioners who wish to develop Netlink‐based interfaces. Copyright © 2010 John Wiley & Sons, Ltd. 相似文献

19.

双网通信模式测控系统的研究与实现

王学慧靳斌梁凯刘亮刘贵王艺《电脑与微电子技术》2014,(2):66-71

提出一种具有Wi-Fi/GPRS双网模式通信功能的测控系统设计。该系统的网络传输通道既可以是Wi-Fi网络,也可以是GPRS网络,双网模式解决测控设备快速接网时,布线复杂、成本高和可靠性差等问题,为远程通信事业提供便利节约成本。手机监控终端是Android系统智能手机,如果手机使用者有用户的监视密码,则可以监视用户现场的所有模块的实时数据和初始化参数。如果手机使用者有用户的控制密码则可以修改用户现场的参数,实现远方校准,修改模块的工作模式。相似文献

20.

Symmetric multiprocessing from boot to virtualization

下载免费PDF全文

Robert Denz Scott Brookes Martin Osterloh Stephen Kuhn Stephen Taylor 《Software》2018,48(3):681-718

The x86 multicore processor architecture and the concepts associated with symmetric multiprocessing have become the linchpin of modern operating systems and cloud computing. A solid understanding of these technologies has become critical to any developer entering the field. Unfortunately, the complexity associated with discovering, enabling, using, and virtualizing multiple cores has created a paucity in the available documentation, transferable knowledge, and readable code exemplars. This paper describes our experience in overcoming these hurdles in the design of a from scratch, multi‐core operating system––Bear––for secure and resilient cloud computing. In particular, we trace the intricacies involved in the development of a multi‐core microkernel with an integrated multi‐core hypervisor. By exploring the implementation details, from bootstrapping through core virtualization to process scheduling, this paper aims to fill the knowledge gaps, highlight potential pitfalls, and introduce multicore development in a concise start‐to‐finish exemplar. 相似文献