1.

Finding a suitable checkpoint and recovery protocol for a distributed application

《Journal of Parallel and Distributed Computing》2006,66(5):732-749

Checkpoint and recovery protocols are commonly used in distributed applications to provide fault tolerance. The performance of a checkpoint and recovery protocol is judged by the amount of computation it can save against the amount of overhead it incurs. This performance depends on different system and application characteristics, as well as on protocol-specific parameters. Hence, no single checkpoint and recovery protocol works equally well for all applications, and given a distributed application and the system it will run on, it is important to choose a protocol that will give the best performance for that system and application. In this paper, we present a scheme to automatically identify a suitable checkpoint and recovery protocol for a given distributed application running on a given system. The scheme involves a novel technique for finding the similarity between the communication patterns of two distributed applications, which is also of independent interest. The similarity measure is based on a graph similarity problem. We present a heuristic for the graph similarity problem. Extensive experimental results are shown both for the graph similarity heuristic and for the automatic identification scheme to show that an appropriate checkpoint and recovery protocol can be chosen automatically for a given application.

2.

A Distributed Error Recovery Technique and Its Implementation and Application on UNIX

Zhou Di, Xu Xiangwen 《Journal of Computer Science and Technology》1990,5(2):127-138

This paper presents a checkpoint-setting technique that eliminates the domino effect in backward recovery in distributed systems. The technique is very efficient, powerful, widely applicable, and easy to implement. Besides a theoretical analysis, an implementation on the UNIX system and a package for software fault tolerance are introduced. The problems of checkpoint management and process termination are then discussed.

3.

The “Always Best Packet Switching” architecture for SIP-based mobile multimedia services

Vittorio Ghini, Stefano Ferretti, Fabio Panzieri 《Journal of Systems and Software》2011,84(11):1827-1851

This paper presents a distributed architecture for the provision of seamless and responsive mobile multimedia services. This architecture allows its user applications to use concurrently all the wireless network interface cards (NICs) a mobile terminal is equipped with. In particular, as mobile multimedia services are usually implemented using the UDP protocol, our architecture enables the transmission of each UDP datagram through the “most suitable” (e.g. most responsive, least loaded) NIC among those available at the time a datagram is transmitted. We term this operating mode of our architecture Always Best Packet Switching (ABPS). ABPS enables the use of policies for load balancing and recovery purposes. In essence, the architecture we propose consists of the following two principal components: (i) a fixed proxy server, which acts as a relay for the mobile node and enables communications from/to this node regardless of possible firewalls and NAT systems, and (ii) a proxy client running in the mobile node, responsible for maintaining a multi-path tunnel, constructed out of all the node's NICs, with the above-mentioned fixed proxy server. We show how the architecture supports multimedia applications based on the SIP and RTP/RTCP protocols, and avoids the typical delays introduced by the two-way message/response handshake of the SIP signaling protocol. Experimental results obtained from the implementation of a VoIP application on top of the architecture we propose show the effectiveness of our approach.

4.

Automated online monitoring of distributed applications through external monitors (total citations: 1; self-citations: 0; citations by others: 1)

G. Khanna P. Varadharajan S. Bagchi 《Dependable and Secure Computing, IEEE Transactions on》2006,3(2):115-129

It is a challenge to provide detection facilities for large-scale distributed systems running legacy code on hosts that may not allow fault tolerant functions to execute on them. It is tempting to structure the detection in an observer system that is kept separate from the observed system of protocol entities, with the former only having access to the latter's external message exchanges. In this paper, we propose an autonomous self-checking monitor system, which is used to provide fast detection to underlying network protocols. The monitor architecture is application neutral and, therefore, lends itself to deployment for different protocols, with the rulebase against which the observed interactions are matched making it specific to a protocol. To make the detection infrastructure scalable and dependable, we extend it to a hierarchical monitor structure. The monitor structure is made dynamic and reconfigurable by designing different interactions to cope with failures, load changes, or mobility. The latency of the monitor system is evaluated under fault free conditions, while its coverage is evaluated under simulated error injections.

5.

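Research and Implementation of a Routing Protocol Conformance Testing System (total citations: 5; self-citations: 2; citations by others: 3)

Li Jian, Zhou Hao, Zhao Baohua 《Computer Engineering and Applications》2005,41(16):119-123

By analyzing the characteristics of routing protocols, this paper identifies the content and goals of routing protocol conformance testing. Based on that content and on a study of existing test methods and test systems, it proposes a distributed virtual test method for routing protocol conformance testing, in which one control module coordinates multiple virtual testers that work together to test the implementation under test. Following this method, an extensible routing protocol test system was implemented and used to complete conformance testing of IPv6 routing protocols. The testing of OSPFv3 is used as the example.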

6.

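Research on Integrated Design and Distributed Simulation of Aero-engines in a Grid Environment

Cao Yuan, Jin Xianlong 《Journal of Computer-Aided Design & Computer Graphics》2005,17(8):1851-1856

A system environment for developing distributed simulations of aero-engines is established. Sub-models in this environment are autonomous: the designer of a sub-model can choose the tools to use as needed and define variables and their relationships with those of other designers. Emerging grid technology is applied to build an environment framework for integrated design and distributed simulation. The framework can flexibly create and modify aero-engine models built on a component object model, and attempts to address the multidisciplinary coupling and heavy computational load that arise in simulation. It has a graphical interface through which engine parameters and structure can be changed conveniently, and the easy extensibility of grid technology also provides a good platform for building more complex engine models in the future. Researchers can use the extensible design and simulation environment provided by the system to flexibly assemble new aero-engine models and simulate them.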

7.

Memory and Network Architecture Interaction in an Optically Interconnected Distributed Shared Memory System

《Journal of Parallel and Distributed Computing》1995,25(2):144-161

This paper develops a performance model of an optically interconnected parallel computer system operating in a distributed shared memory environment. The performance model is developed to reflect the impact of the low-level optical media access protocol and optical device switching latency on high-level system performance. This enables the model to predict the performance impact of supporting distributed shared memory with different address allocation schemes and media access protocols. The passive star-coupled photonic network operates through wavelength division multiple access. Two media access protocols are examined for this WDM network; both are designed to operate in a multiple-channel multiple-access environment and require each node to possess a wavelength-tunable transmitter and a fixed (or slow tunable) receiver. A semi-Markov model has been developed to study the interaction of the distributed shared memory architecture and the two access protocols of the photonic network. This analytical model has been validated by extensive simulation. The model is then used to examine the system performance with varying numbers of nodes and wavelength channels and varying memory and channel access times.

8.

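Design and Implementation of a SIP NAT Application Gateway (total citations: 6; self-citations: 0; citations by others: 6)

He Yonglong, Lin Hu, Lei Weimin 《Mini-Micro Systems》2002,23(8):913-916

How users of a SIP softswitch system who use private addresses can hold sessions with users on the public network, that is, how to apply NAT to SIP messages, has no application standard yet. The methods proposed in existing drafts, whether or not they are implemented at the application layer, all extend the SIP protocol itself, which makes implementation considerably harder and also introduces compatibility problems. Starting from the characteristics of the SIP message body, this paper proposes an easy-to-implement application-layer solution that requires no extension to SIP, and describes the implementation in detail with reference to the call flow.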

9.

Software Distributed Shared Memory: a VIA‐based implementation and comparison of sequential consistency with home‐based lazy release consistency

Vadim Iosevich Assaf Schuster 《Software》2005,35(8):755-786

A Distributed Shared Memory (DSM) system provides a distributed application with a shared virtual address space. This article proposes a design for implementing the DSM communication layer on top of the Virtual Interface Architecture (VIA), an industry standard for user‐level networking protocols on high‐speed clusters. User‐level communication protocols operate in user mode, thus removing the operating system kernel's overhead from the critical communication path, and significantly diminishing communication overhead as a result. We analyze VIA's facilities and limitations in order to ascertain which implementation trade‐offs can be best applied to our development of an efficient communication substrate optimized for DSM requirements. We then implement a multithreaded version of the Home‐based Lazy Release Consistency (HLRC) protocol on top of this substrate. In addition, we compare the performance of this HLRC protocol with that of the Sequential Consistency (SC) protocol in which a Multi View (MV) memory mapping technique was used. This technique enables fine‐grained access to shared memory, while still relying on the virtual memory hardware to track memory accesses. We perform an ‘apple‐to‐apple’ comparison on the same testbed environment and benchmark suite, and investigate the effectiveness and scalability of both protocols. Copyright © 2005 John Wiley & Sons, Ltd.

10.

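Analysis of Secure Routing Protocols for Mobile Ad Hoc Networks Based on the Spi Calculus

Wang Yinglong, Xu Donghong, Wang Meiqin, Gao Zhonghe 《Computer Engineering and Applications》2006,42(6):132-135

Security protocols are the foundation of security in many distributed systems and also of MANETs, so ensuring the secure operation of MANET routing protocols is extremely important. Given the characteristics of MANETs, designing a reliable secure routing protocol is both necessary and difficult. Most secure routing protocols, however, are justified by simulation results and lack a rigorous formal analysis to guarantee their security properties. Cryptographic protocols have been formally analyzed for many years, but no literature offering mature methods and theory for the formal analysis of mobile ad hoc network routing protocols has yet appeared. This paper uses the spi calculus to formally analyze a model of SRP (Secure Routing Protocol); under the attacker process model proposed in the paper, it can be derived that SRP has certain vulnerabilities.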

11.

Termination detection protocols for mobile distributed systems (total citations: 1; self-citations: 0; citations by others: 1)

Yu-Chee Tseng Cheng-Chung Tan 《Parallel and Distributed Systems, IEEE Transactions on》2001,12(6):558-566

This paper studies a fundamental problem, the termination detection problem, in distributed systems. Under a wireless network environment, we show how to handle the host mobility and disconnection problems. In particular, when some distributed processes are temporarily disconnected, we show how to capture a weakly terminated state where silence has been reached only by those currently connected processes. A user may desire to know such a state to tell whether the mobile distributed system is still running or is silent because some processes are disconnected. Our protocol tries to exploit the network hierarchy by combining two existing protocols. It employs the weight-throwing scheme on the wired network side, and the diffusion-based scheme in each wireless cell. Such a hybrid protocol can better bridge the gaps in computation and communication capability between static and mobile hosts, and is thus more scalable to larger distributed systems. Analysis and simulation results are also presented.
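
The weight-throwing scheme mentioned in this abstract can be illustrated with a minimal sketch. It covers only the basic idea (a controller starts with weight 1, every message carries part of the sender's weight, idle processes return their weight, and termination is declared once the controller holds weight 1 again). It does not reproduce the paper's hybrid protocol or its handling of mobility and disconnection. The class names `Process` and `Controller` and the `demo` function are hypothetical, and message delivery is assumed to be instantaneous.

```python
from fractions import Fraction


class Process:
    """A worker process; weight 0 means it is idle (illustrative model only)."""

    def __init__(self, name):
        self.name = name
        self.weight = Fraction(0)

    def receive(self, weight):
        # A computation message carries weight and activates the receiver.
        self.weight += weight

    def send(self, other):
        # Keep half of the current weight and attach the other half to the message.
        half = self.weight / 2
        self.weight -= half
        other.receive(half)

    def become_idle(self, controller):
        # An idle process returns all of its weight to the controller.
        controller.collect(self.weight)
        self.weight = Fraction(0)


class Controller:
    """Holds weight 1 initially; the total weight in the system stays 1."""

    def __init__(self):
        self.weight = Fraction(1)

    def start(self, process):
        half = self.weight / 2
        self.weight -= half
        process.receive(half)

    def collect(self, weight):
        self.weight += weight

    def terminated(self):
        # All weight is back, so no process is active and no message is in flight.
        return self.weight == 1


def demo():
    ctrl, a, b = Controller(), Process("a"), Process("b")
    ctrl.start(a)            # ctrl 1/2, a 1/2
    a.send(b)                # a 1/4, b 1/4
    a.become_idle(ctrl)      # ctrl 3/4
    before = ctrl.terminated()
    b.become_idle(ctrl)      # ctrl 1
    return before, ctrl.terminated()
```

Exact fractions are used so that repeated halving never loses weight to rounding, which would otherwise prevent the controller from ever seeing a total of exactly 1.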

12.

Hierarchical simulation approach to accurate fault modeling for system dependability evaluation

Kalbarczyk Z. Iyer R.K. Ries G.L. Patel J.U. Lee M.S. Xiao Y. 《IEEE Transactions on Software Engineering》1999,25(5):619-632

This paper presents a hierarchical simulation methodology that enables accurate system evaluation under realistic faults and conditions. In this methodology, effects of low-level (i.e., transistor or circuit level) faults are propagated to higher levels (i.e., system level) using fault dictionaries. The primary fault models are obtained via simulation of the transistor-level effect of a radiation particle penetrating a device. The resulting current bursts constitute the first-level fault dictionary and are used in the circuit-level simulation to determine the impact on circuit latches and flip-flops. The latched outputs constitute the next-level fault dictionary in the hierarchy and are applied in conducting fault injection simulation at the chip level under selected workloads or application programs. Faults injected at the chip level result in memory corruptions, which are used to form the next-level fault dictionary for the system-level simulation of an application running on simulated hardware. When an application terminates, either normally or abnormally, the overall fault impact on the software behavior is quantified and analyzed. The system in this sense can be a single workstation or a network. The simulation method is demonstrated and validated in a case study of a network system based on Myrinet (a commercial, high-speed network).

13.

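Research on Distributed Processing of IP Fragments

Guo Fangfang, Yang Yongtian 《Computer Science》2006,33(11):34-37

Traditional IP fragment processing techniques are suitable only for networks with a single checkpoint. With the rapid growth of distributed network applications, this basic technique of the traditional TCP/IP protocol suite is increasingly unable to adapt to new network environments and hinders the adoption of new network technologies. Building on a distributed hash algorithm, this paper proposes a solution in which multiple points cooperate to process IP fragments in a distributed environment: each IP fragment is assigned a specific hash function value and is processed by the corresponding checkpoint. In addition, a folding XOR method is used to speed up the hash computation, and head insertion into linked lists is used to make hash collision handling more efficient. Simulation experiments show that the algorithm can be applied in distributed network environments and has good network adaptability and stability.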
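
As a rough illustration of the two techniques named in this abstract, the sketch below folds a fragment key into 16 bits by XOR and resolves collisions with chains that insert at the head. The abstract does not give the key layout, hash width, or table size, so everything here is an assumption: the key is built from the usual reassembly tuple (source, destination, identification, protocol), and the names `fold_xor`, `fragment_key`, `checkpoint_for`, and `FragmentTable` are hypothetical.

```python
def fold_xor(key: int, width: int = 16) -> int:
    """Fold an integer key into `width` bits by XOR-ing its width-bit chunks."""
    mask = (1 << width) - 1
    h = 0
    while key:
        h ^= key & mask
        key >>= width
    return h


def fragment_key(src: int, dst: int, ident: int, proto: int) -> int:
    # Assumed layout: 32-bit source, 32-bit destination, 16-bit identification,
    # 8-bit protocol. All fragments of one datagram share these fields.
    return (src << 56) | (dst << 24) | (ident << 8) | proto


def checkpoint_for(src, dst, ident, proto, n_checkpoints):
    # Every fragment of a datagram maps to the same checkpoint index.
    return fold_xor(fragment_key(src, dst, ident, proto)) % n_checkpoints


class FragmentTable:
    """Chained hash table; each chain is a linked list of (key, fragment, next)."""

    def __init__(self, n_buckets: int = 256):
        self.buckets = [None] * n_buckets

    def insert(self, key, fragment):
        i = fold_xor(key) % len(self.buckets)
        # Head insertion: the new node becomes the chain head in O(1).
        self.buckets[i] = (key, fragment, self.buckets[i])

    def lookup(self, key):
        node = self.buckets[fold_xor(key) % len(self.buckets)]
        found = []
        while node is not None:
            k, fragment, node = node
            if k == key:
                found.append(fragment)
        return found
```

With head insertion, the most recently stored fragments sit at the front of a chain, so a lookup for a datagram that is still arriving tends to reach its fragments early in the walk.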

14.

Design and implementation of efficient communication abstractions on the Virtual Interface Architecture: Stream sockets and RPC experience

Hemal V. Shah Rajesh S. Madukkarumukumana 《Software》2001,31(11):1043-1065

The emergence and standardization of system area networks (SANs) has provided distributed applications with a medium for high‐bandwidth, low‐latency communication. Standard user‐level networking architecture such as the Virtual Interface (VI) Architecture enables distributed applications to perform low overhead communication over SANs. The VI Architecture significantly reduces system processing overheads and provides each consumer process with a protected, directly accessible interface to the network hardware. Developing distributed applications using low‐level primitives provided by user‐level networking architecture like the VI Architecture is complex and requires significant effort. This paper describes how high‐level communication paradigms like stream sockets and remote procedure call (RPC) can be efficiently built over the VI Architecture. To evaluate performance benefits for standard client–server and multi‐threaded environments, our focus is on off‐the‐shelf sockets and RPC interfaces and commercially available VI Architecture‐based SANs. The key design techniques developed in this paper include credit‐based flow control, decentralized user‐level protocol processing, caching of pinned communication buffers, and deferred processing of completed send operations. In the experimental evaluation, the one‐way bandwidth achieved by stream sockets over VI Architecture was three to four times better than the bandwidth achieved by running legacy protocols over the same interconnect. On the same SAN, high‐performance stream sockets and RPC over VI Architecture achieve significantly better (between 2× and 3× less) latency than conventional stream sockets and RPC over standard networking protocols in a Windows NT 4.0 environment. Furthermore, our high‐performance RPC transparently improved the network performance of the distributed component object model (DCOM) by a factor of two to three. Copyright © 2001 John Wiley & Sons, Ltd.
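
Credit-based flow control, one of the design techniques listed in this abstract, can be sketched in a few lines. This is a generic illustration rather than the paper's implementation: the sender holds one credit per receive buffer posted at the peer, spends a credit for every message, and queues messages locally when it has none left. The names `CreditSender` and `CreditReceiver` are hypothetical, and credit return is modeled as a direct method call instead of a network message.

```python
from collections import deque


class CreditReceiver:
    def __init__(self, buffers: int):
        self.free = buffers          # receive buffers currently posted
        self.inbox = deque()

    def deliver(self, msg):
        # A correct sender never sends without a credit, so a buffer is free.
        assert self.free > 0, "sender overran the posted buffers"
        self.free -= 1
        self.inbox.append(msg)

    def consume(self, sender):
        # The application reads a message; the buffer is reposted and
        # one credit goes back to the sender.
        msg = self.inbox.popleft()
        self.free += 1
        sender.add_credits(1, self)
        return msg


class CreditSender:
    def __init__(self, credits: int):
        self.credits = credits       # buffers known to be posted at the receiver
        self.pending = deque()       # messages waiting for a credit

    def send(self, msg, receiver):
        self.pending.append(msg)
        self._drain(receiver)

    def add_credits(self, n, receiver):
        self.credits += n
        self._drain(receiver)

    def _drain(self, receiver):
        # Transmit queued messages in order while credits remain.
        while self.pending and self.credits > 0:
            self.credits -= 1
            receiver.deliver(self.pending.popleft())
```

For example, with two buffers and two credits, a third `send` stays in `pending` until the receiver consumes a message and returns a credit.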

15.

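Research on a Protocol Implementation Method Based on Protocol Engines

Shen Jun, Pan Jianping 《Journal of Computer Research and Development》1998,35(11):1037-1041

To resolve, from the implementation perspective, the tension between the similarity and diversity of network protocols, as well as the system resource consumption and the complexity of protocol management and maintenance caused by implementing different protocols in different ways, this paper analyzes the basic problems of OSI-based protocol implementation and proposes using object-oriented methods and virtual concepts to address the generality problem. It then proposes a protocol implementation method based on protocol engines, analyzes its structural characteristics and several considerations for its application, and argues that efficiency is the main evaluation factor. The paper also gives implementation schemes for a network management module and a gateway router that apply the protocol engine in a Java environment.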

16.

An extensible probe architecture for network protocol performance measurement

David Watson G. Robert Malan Farnam Jahanian 《Software》2004,34(1):47-67

This paper describes the architecture, implementation, and application of Windmill, a passive network protocol performance measurement tool. Windmill enables experimenters to measure a broad range of protocol performance metrics both by reconstructing application‐level network protocols and by exposing the underlying protocol layers' events. Windmill is split into three functional components: a dynamically compiled Windmill Protocol Filter (WPF), a set of abstract protocol modules, and an extensible experiment engine. To demonstrate Windmill's utility, we present the results from several experiments. The first set of experiments validates a possible cause for the correlation between Internet routing instability and network bandwidth usage. The second set of experiments highlights Windmill's ability to act as a driver for a complementary active Internet measurement infrastructure, its ability to perform online data reduction, and the non‐intrusive measurement of a closed system. Copyright © 2003 John Wiley & Sons, Ltd.

17.

Fault-tolerant distributed shared memory on a broadcast-based architecture

Katsinis C. Hecht D. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(12):1082-1092

Due to advances in fiber-optics and VLSI technology, interconnection networks that allow multiple simultaneous broadcasts are becoming feasible. Distributed-shared-memory implementations on such networks promise high performance even for applications with small granularity. This paper presents the architecture of one such implementation, called the simultaneous optical multiprocessor exchange bus, and examines the performance of augmented DSM protocols that exploit the natural duplication of data to maintain a recovery memory in each processing node and provide basic fault tolerance. Simulation results show that the additional data duplication necessary to create fault-tolerant DSM causes no reduction in system performance during normal operation and eliminates most of the overhead at checkpoint creation. Under certain conditions, data blocks that are duplicated to maintain the recovery memory are utilized by the underlying DSM protocol, reducing network traffic, and increasing the processor utilization significantly.

18.

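Design of an Underwater Acoustic Network Simulation Framework for Multiple Underwater Robots

Gao Yong, Li Yiping 《Microcomputer Information》2010,(14)

To meet the simulation needs of multi-underwater-robot systems, a LAN-based simulation framework for underwater acoustic network communication protocols is designed according to models of the underwater acoustic channel and the acoustic modem. The framework provides a shared virtual acoustic channel for a network of underwater robots that interact in a distributed manner, and simulates how a communication protocol runs on multiple underwater robot nodes. Finally, simulation results for the ALOHA protocol under a typical network topology are given.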

19.

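A QoS-Based Routing Protocol for Ad Hoc Network Environments (total citations: 21; self-citations: 0; citations by others: 21)

Ying Chun, Shi Meilin 《Chinese Journal of Computers》2001,24(10):1026-1033

An ad hoc network is a multi-hop, temporary autonomous system made up of a group of mobile nodes equipped with wireless transceivers. Because each node's wireless coverage is limited, packets must be forwarded by intermediate nodes to reach their destination. Conventional routing protocols cannot operate effectively in an ad hoc environment. This paper first describes the concept and characteristics of ad hoc networks and then proposes a QoS-based routing protocol for them. The main idea of the protocol is to perform route discovery, selection, and maintenance according to two important metrics of a wireless link: average packet error rate and lifetime. Compared with hop count, these metrics give users the transmission of information flows that is most likely to satisfy specific QoS requirements.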
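
To make the contrast with hop-count routing concrete, the sketch below selects a route using the two link metrics named in the abstract. The abstract does not say how the metrics are combined along a path, so the rules used here are assumptions: per-link error rates are treated as independent (path delivery probability is the product of the per-link success rates), a path lives only as long as its shortest-lived link, and the route with the best delivery probability among those meeting a lifetime requirement is chosen. The names `Link`, `path_metrics`, and `select_route` are hypothetical.

```python
from math import prod
from typing import NamedTuple, Optional, Sequence


class Link(NamedTuple):
    error_rate: float   # average packet error rate on this link, in [0, 1]
    lifetime: float     # expected remaining lifetime of the link, in seconds


def path_metrics(path: Sequence[Link]):
    """Return (delivery probability, lifetime) of a path under the stated assumptions."""
    delivery = prod(1.0 - link.error_rate for link in path)
    lifetime = min(link.lifetime for link in path)
    return delivery, lifetime


def select_route(paths: Sequence[Sequence[Link]],
                 min_lifetime: float) -> Optional[Sequence[Link]]:
    # Keep only routes expected to survive long enough for the flow.
    feasible = [p for p in paths if path_metrics(p)[1] >= min_lifetime]
    if not feasible:
        return None
    # Among those, prefer the route most likely to deliver each packet.
    return max(feasible, key=lambda p: path_metrics(p)[0])
```

Under these assumptions a two-hop route over clean, stable links can be preferred to a one-hop route over a lossy or short-lived link, which a pure hop-count metric would never do.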

20.

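An XML-Based Centralized Conference Control Protocol

Yu Ling, Wang Xi, Du Xiangjun, Tai Jianhua 《Microcomputer Information》2005,(26)

For communication between different types of conferencing systems, it is important to establish a conference control protocol that is independent of the signaling protocol. This paper simulates an XML-based conference control protocol, treats it as a standard mechanism, describes its simple application in different types of conferencing systems, and describes the concrete framework and simple operations. Simulation results show that, for controlling small and medium-sized conferences, the protocol provides stable service as the number of conferences and participants increases.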