Similar articles
 20 similar articles found.
1.
Software-based rerouting for fault-tolerant pipelined communication   Cited by: 1 (self-citations: 0; citations by others: 1)
This paper presents a software-based approach to fault-tolerant routing in networks using wormhole or virtual cut-through switching. When a message encounters a faulty output link, it is removed from the network by the local router and delivered to the messaging layer of the local node's operating system. The message passing software can reroute this message, possibly along nonminimal paths. Alternatively, the message may be addressed to an intermediate node, which will forward the message to the destination. A message may encounter multiple faults and pass through multiple intermediate nodes. The proposed techniques are applicable to both obliviously and adaptively routed networks. The techniques are specifically targeted toward commercial multiprocessors where the mean time to repair (MTTR) is much smaller than the mean time between router failures (MTBF), i.e., it is sufficient to tolerate a maximum of three failures. This paper presents requirements for buffer management, deadlock freedom, and livelock freedom. Simulation results are presented to evaluate the degradation in latency and throughput as a function of the number and distribution of faults. There are several advantages of such an approach. Router designs are minimally impacted, and thus remain compact and fast. Only messages that encounter faulty components are affected, while the machine is ensured of continued operation until the faulty components can be replaced. The technique leverages existing network technology, and the concepts are portable across evolving switch and router designs. Therefore, we feel that the technique is a good candidate for incorporation into the next generation of multiprocessor networks.

2.
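Jin Xinmiao. Microcomputer Information (《微计算机信息》), 2006, 22(23): 263-265
This paper describes a communication system for distributed fault-tolerant computers, designed and implemented according to the Distributed Dynamic Change Communication Pattern Protocol (DDCCPP). The system provides a reliable hard core, and the results of actual operation are satisfactory.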

3.
Supersystems are shown to provide enough computational power to solve complex problems on a real-time basis. In all these systems, the computational parallelism is obtained from multiple processors. Multistage interconnection networks (MINs) play a vital role in the performance of these multiprocessor systems. This paper introduces a new fault-tolerant MIN named the improved extra group network (IEGN). IEGN is derived from the existing extra group network (EGN), which is a regular multipath network with limited fault tolerance. IEGN provides four times more paths between any source–destination pair compared with EGN. The performance of IEGN has been evaluated in terms of permutation capability, fault tolerance, reliability, path length, and cost. It has also been proved that the IEGN can achieve better results in terms of fault tolerance, reliability, path length, and cost-effectiveness, in comparison to known networks, namely, EGN, augmented baseline network, augmented shuffle-exchange network, fault-tolerant double tree, Benes network, and Replicated MIN.

4.
Using AVL trees for fault-tolerant group key management   Cited by: 1 (self-citations: 0; citations by others: 1)
In this paper we describe an efficient algorithm for the management of group keys for group communication systems. Our algorithm is based on the notion of key graphs, previously used for managing keys in large Internet-protocol multicast groups. The standard protocol requires a centralized key server that has knowledge of the full key graph. Our protocol does not delegate this role to any one process. Rather, members enlist in a collaborative effort to create the group key graph. The key graph contains $n$ keys, of which each member learns $\log_2 n$. We show how to balance the key graph, a result that is applicable to the centralized protocol. We also show how to optimize our distributed protocol, and provide a performance study of its capabilities. Published online: 26 October 2001
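To make the "each member learns $\log_2 n$ keys" property concrete, here is a minimal sketch of which keys a member holds in a balanced key tree. It is not the paper's protocol: it assumes a complete binary tree with the members at the leaves, heap-style node numbering, and a power-of-two group size, and the function name `shared_key_path` is hypothetical. Whether the member's own leaf key counts toward the total is also an assumption; the sketch counts only the shared keys above the leaf.

```python
def shared_key_path(member: int, n_members: int) -> list[int]:
    """Return the tree nodes whose keys `member` shares with others.

    Assumption (not from the paper): a complete binary key tree in heap
    numbering. Node 1 is the root (the group key) and the leaves are
    nodes n_members .. 2 * n_members - 1, one per member.
    """
    if n_members < 1 or n_members & (n_members - 1):
        raise ValueError("this sketch needs a power-of-two group size")
    if not 0 <= member < n_members:
        raise ValueError("member index out of range")
    node = (n_members + member) // 2  # parent of the member's leaf
    path = []
    while node >= 1:
        path.append(node)  # every ancestor's key is known to the member
        node //= 2
    return path
```

In a balanced tree the path length is exactly $\log_2 n$, which is why balancing matters: when a member leaves, only the keys on its path need to be replaced.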

5.
Dynamic group communication   Cited by: 1 (self-citations: 0; citations by others: 1)
Group communication is the basic infrastructure for implementing fault-tolerant replicated servers. While group communication is well understood in the context of static groups (in which the membership does not change), current specifications of dynamic group communication (in which processes can join and leave groups during the computation) have not yet reached the same level of maturity. The paper proposes new specifications, in the primary partition model, for dynamic reliable broadcast (simply called “reliable multicast”), dynamic atomic broadcast (simply called “atomic multicast”) and group membership. In the special case of a static system, the new specifications are identical to the well-known static specifications. Interestingly, not only are these new specifications “syntactically” close to the static specifications, but they are also “semantically” close to the dynamic specifications proposed in the literature. We believe that this should help clarify a topic that has always been difficult for outsiders to understand. Finally, the paper shows how to solve atomic multicast, group membership and reliable broadcast. The solution of atomic multicast is close to the (static) atomic broadcast solution based on reduction to consensus. Group membership is solved using atomic multicast. Reliable multicast can be efficiently solved by relying on a thrifty generic multicast algorithm.
André Schiper graduated in physics from ETHZ in Zurich in 1973 and received the PhD degree in computer science from EPFL (Federal Institute of Technology in Lausanne, Switzerland) in 1980. He has been a professor of computer science at EPFL since 1985, leading the Distributed Systems Laboratory. During the academic year 1992–1993, he was on sabbatical leave at Cornell University, Ithaca, New York, and in 2004–2005 at the Ecole Polytechnique near Paris. His research interests are in the area of dependable distributed systems, middleware support for dependable systems, replication techniques (including for database systems), group communication, distributed transactions, and, recently, MANETs (mobile ad-hoc networks). From 2000 to 2002, he was the chair of the steering committee of the International Symposium on Distributed Computing (DISC). He has taken part in several European projects. He is currently a member of the editorial board of Distributed Computing, and of IEEE Transactions on Dependable and Secure Computing.

6.
With the fast development of network applications, there are more asynchronous distributed systems and more requirements for fault tolerance. Asynchrony means there is no upper bound on either message transfer time or operation execution time. Active replication is an effective means to enhance fault-tolerance capability in distributed systems. A key component in a system is replicated, and all the replicas make up a fault-tolerant group. Members in such a group execute all client requests and then re…

7.
In a web group-learning environment, students must communicate with other group members on the Internet to accomplish group projects and share knowledge. Communication is likely to affect performance, so analysing the relationship between communicative relationships and group performance may help teachers to monitor groups effectively. Certain tasks are necessary to perform such an analysis: recording group communication, extracting communication relationships and determining the relationship between group communication and group performance. This study developed a method for determining relationships and rules for predicting performance to enable teachers to act appropriately according to the predicted performance of the group. Four group performance indicators are considered: average grades within a group, project grade, frequency of resource-sharing and drop-out rate. Experimental results are presented, concerning the application of the methodology to a web class of 706 students, divided into 70 groups. The experimental results show that group communication patterns significantly affect group performance.

8.
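Drawing on social computing, a group communication simulation platform was built. Experimental results show that the platform's output is similar to the results of analysing real data, mainly in two respects: the distributions of the number of topics against topic view counts and against topic reply counts each exhibit power-law characteristics, and topics spread in different patterns. Using the platform, strategies for influencing group communication were studied and three different influence strategies were analysed quantitatively. The experimental results are of some use in promoting group communication.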

9.
A space-efficient Information Dispersal Algorithm (IDA) is applied to fault-tolerant parallel communication in the hypercube. Let $N$ denote the size of the network. Our routing scheme runs in $2 \log N + 1$ time using constant-size buffers (if the routing information is not counted). Its probability of successful routing is at least $1 - N^{-2.419 \log N + 1.5}$, proving Rabin's conjecture. The scheme runs within the said time bound without queueing delay, and it tolerates $O(N)$ random link failures with high probability. Optimal on-line and efficient wire maintenance on the hypercube can be realized if our fault-tolerant routing scheme is used. It is shown that a constant fraction of the wires (1/352 of the total number of links in the hypercube) can be disabled simultaneously without disrupting the ongoing computation or degrading the routing performance much. This property suggests various on-line maintenance procedures. This research was supported by NSF Grant MCS-8121431 at Harvard University. This paper is based on Chapters 4, 5, and 8 of the author's Ph.D. dissertation.
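To get a feel for the magnitudes in this abstract, the short sketch below evaluates the stated bounds for a given hypercube dimension. It makes two assumptions not stated in the abstract: that the logarithm is base 2 (so log N equals the dimension), and that the link count is the standard d·2^(d-1) for a d-dimensional hypercube. The function name `hypercube_bounds` is hypothetical, and the failure bound is returned as a base-2 exponent to avoid floating-point underflow.

```python
def hypercube_bounds(dim: int) -> dict:
    """Evaluate the abstract's bounds for a hypercube with 2**dim nodes."""
    n = 2 ** dim                       # N, the network size
    log_n = dim                        # assumption: log is base 2
    steps = 2 * log_n + 1              # routing time 2 log N + 1
    # Failure probability is at most N^(-2.419 log N + 1.5);
    # report log2 of that bound rather than the tiny number itself.
    log2_failure_bound = dim * (-2.419 * log_n + 1.5)
    links = dim * n // 2               # d * 2^(d-1) links in a d-cube
    disabled = links // 352            # wires that may be taken offline
    return {
        "nodes": n,
        "steps": steps,
        "log2_failure_bound": log2_failure_bound,
        "links": links,
        "disabled_links": disabled,
    }
```

For a 10-dimensional hypercube (1024 nodes), this gives 21 routing steps, 5120 links, and 14 links that could be disabled at once under the 1/352 fraction.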

10.
Reliable messaging is a key component of mobile agent systems. Current research focuses on reliable one-to-one message delivery to mobile agents. How to implement a group communication system for mobile agents remains an open issue, although such a system is a powerful building block that facilitates the development of fault-tolerant mobile agent systems. In this paper, we propose a group communication system for mobile agents (GCS-MA), which includes totally ordered multicast and membership management functions. We divide a group of mobile agents into several agent clusters; each agent cluster consists of all mobile agents residing in the same sub-network and is managed by a special module, named the coordinator. All coordinators then form a ring-based overlay for exchanging messages between clusters. We present a token-based algorithm, an intra-cluster messaging algorithm and an inter-cluster migration algorithm to achieve atomicity and total ordering of multicast messages, by building a membership protocol on top of the clustering and failure detection mechanisms. Performance issues of the proposed system have been analysed through simulations. We also describe the application of the proposed system in the context of the service cooperation middleware (SCM) project.
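The idea of a token circulating on a ring of coordinators to impose a total order can be sketched as follows. This is only an illustration under strong assumptions (no failures, no membership changes, no agent migration, delivery modelled as a list append); the classes `Token` and `Coordinator` and the function `circulate` are hypothetical and are not taken from GCS-MA.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    next_seq: int = 0  # next global sequence number to hand out


@dataclass
class Coordinator:
    name: str
    pending: list = field(default_factory=list)    # waiting for the token
    delivered: list = field(default_factory=list)  # (seq, message) pairs


def circulate(ring: list, token: Token, rounds: int = 1) -> None:
    """Pass the token around the ring; only the holder may order messages."""
    for _ in range(rounds):
        for holder in ring:
            while holder.pending:
                message = holder.pending.pop(0)
                stamped = (token.next_seq, message)
                token.next_seq += 1
                # Failure-free multicast to every cluster coordinator.
                for coordinator in ring:
                    coordinator.delivered.append(stamped)
```

Because sequence numbers come only from the single token, every coordinator sees the same sequence of messages, which is the total-order property the abstract refers to.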

11.
Advanced Robotics, 2013, 27(8): 759-779
A novel design method of robot behavior is discussed to realize efficient local communication for cooperation of multiple mobile robots. Local communication is now increasingly utilized in cooperative many-robot systems because of its advantages of load distribution and simple implementation. In its usage, the design of each robot's behavior is a very important issue since it has a significant effect upon the communication efficiency in a collective manner. In this study, we introduce a simple group behavior and analyze how it improves the performance of local communication among many mobile robots. The performance is evaluated using the information transmission time that plays a crucial part in effective cooperation. Next, the optimal group size is analytically derived by minimizing the transmission time. The effectiveness of the analytical design method is verified by computer simulations of many-robot communication.

12.
A distributed fault-tolerant strategy for the controller area network based electric swing system of hybrid excavators is proposed to achieve good performance under communication errors, based on adaptive compensation of delays and packet dropouts. The adverse impacts of communication errors are effectively reduced by a novel delay compensation scheme, where the feedback signal and the control command are compensated in each control period in the central controller and the swing motor driver, respectively, without requiring additional network bandwidth. The recursive least-squares algorithm with a forgetting factor is employed to identify the model parameters, which vary over time due to pose variation, and a reverse correction law is embedded into the feedback compensation in consecutive packet dropout scenarios to overcome the impacts of the model error. Simulations and practical experiments are conducted. The results show that the proposed fault-tolerant strategy can effectively reduce the communication-error-induced overshoot and response time variation.
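The recursive least-squares update with a forgetting factor mentioned above follows a standard textbook form, sketched here in plain Python. The excavator model, its regressors, and the forgetting factor value used in the paper are not given in the abstract, so the function names `rls_step` and `identify`, the default `lam=0.98`, and the initial covariance `p0` are all illustrative assumptions.

```python
def rls_step(theta, P, phi, y, lam=0.98):
    """One RLS update for the model y = phi . theta, forgetting factor lam."""
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    error = y - sum(phi[i] * theta[i] for i in range(n))  # prediction error
    theta_new = [theta[i] + gain[i] * error for i in range(n)]
    # P stays symmetric, so phi^T P equals (P phi)^T.
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new


def identify(samples, n_params, lam=0.98, p0=1e3):
    """Run RLS over (phi, y) samples, starting from theta = 0, P = p0 * I."""
    theta = [0.0] * n_params
    P = [[p0 if i == j else 0.0 for j in range(n_params)]
         for i in range(n_params)]
    for phi, y in samples:
        theta, P = rls_step(theta, P, phi, y, lam)
    return theta
```

A forgetting factor below 1 discounts old samples geometrically, which lets the estimate follow slowly varying parameters at the cost of more sensitivity to noise.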

13.
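In recent years, grid computing has increasingly become a viable solution for data-intensive and compute-intensive applications. Both the grid runtime platform itself and parallel applications running in a grid environment require a large amount of one-to-many group communication. This paper proposes a flexible, fault-tolerant group communication mechanism. The mechanism is based on remote method invocation (RMI) and provides efficient, fault-tolerant group communication for distributed parallel applications. Communication methods can be invoked on a local object, a remote object, or a group of objects. The communication is asynchronous, and the initiator can choose between two mechanisms, wait-for-all or wait-by-necessity, to obtain the results, thereby maximising either the reliability or the efficiency of the communication.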
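The two ways of collecting results from an asynchronous group invocation can be illustrated with futures. This sketch substitutes local threads from Python's `concurrent.futures` for RMI, so it shows only the calling pattern, not the paper's mechanism; `Worker`, `group_call`, `wait_all`, and `first_needed` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, wait, ALL_COMPLETED


class Worker:
    """Hypothetical group member; a real system would reach it via RMI."""

    def __init__(self, factor, healthy=True):
        self.factor = factor
        self.healthy = healthy

    def scale(self, x):
        if not self.healthy:
            raise ConnectionError("member unreachable")
        return self.factor * x


def group_call(pool, members, method, *args):
    """Invoke `method` on every member asynchronously; one future each."""
    return {name: pool.submit(getattr(obj, method), *args)
            for name, obj in members.items()}


def wait_all(futures, timeout=None):
    """Wait-for-all: block until every call ends, keep the successful ones."""
    wait(list(futures.values()), timeout=timeout, return_when=ALL_COMPLETED)
    return {name: f.result() for name, f in futures.items()
            if f.done() and f.exception() is None}


def first_needed(futures, name):
    """Wait-by-necessity: block only on the one result needed right now."""
    return futures[name].result()
```

With three members of which one is unreachable, `wait_all` returns the two successful results, while `first_needed` returns as soon as the requested member has answered.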

14.
This paper presents the first self-stabilizing group membership service, multicast service, and resource allocation service for directed networks. The first group communication algorithm is based on a token circulation over a virtual ring. The second algorithm is based on construction of distributed spanning trees. In addition, a technique is presented that emulates, in a self-stabilizing fashion, any undirected communication network over strongly connected directed networks. An asynchronous resource allocation algorithm for strongly connected directed networks is also presented. Received: 23 July 2003. Published online: 29 June 2004. Partially supported by NSF Award CCR-0098305, an IBM faculty award, the STRIMM consortium, and the Israel Ministry of Defense.

15.
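Research on group communication middleware based on the JMS specification   Cited by: 2 (self-citations: 0; citations by others: 2)
Peng Zhen, Zeng Guangzhou. Computer Engineering and Design (《计算机工程与设计》), 2005, 26(7): 1726-1728, 1731
Group communication is an important component of computer-supported cooperative work (CSCW), but the traditional JMS publish/subscribe mechanism is an asynchronous communication model and cannot meet the synchronization requirements that CSCW applications place on group communication. By adding a membership service and a multicast service on top of JMS, a JMS-based group communication message middleware, JGCM, was implemented.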

16.
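Starting from the intrinsic relationship among intrusion tolerance, replication techniques, and group communication, this paper discusses why group communication is necessary in intrusion-tolerant systems and proposes a system model for intrusion-tolerant group communication. Considering both the position of group communication within the overall system and its internal structure, the paper uses formal definitions to design an architecture for intrusion-tolerant group communication and compares it with OSI/RM. This lays a foundation for further study of reliable multicast, totally ordered multicast, and group key management.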

17.
Group communication protocols (GCPs) play an important role in the design of modern distributed systems. A typical GCP exchanges control messages to provide message delivery guarantees, and a key point in the configuration of such a protocol is to establish the right trade-off between message overhead and delivery latency. This trade-off becomes an even greater challenge in systems where computing resources and application requirements may change at runtime. In such scenarios, the configuration of a GCP must be continuously re-adjusted to attain certain performance goals, or to adapt to current resource availability. This paper addresses this challenge by applying self-managing mechanisms based on feedback control theory to a GCP especially designed to be self-manageable; in the proposed protocol, message overhead and delivery latency can be adjusted at runtime to follow a new operating set-point. The evaluation performed under varied scenarios shows the effectiveness of our approach.
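The general idea of steering a protocol parameter toward a latency set-point with feedback can be sketched with a simple integral controller. The abstract does not describe the paper's controller or which parameter it tunes, so everything here is an assumption: the tuned parameter is taken to be the interval between control messages, the function `run_controller` is hypothetical, and the linear latency model used in the usage note is a toy.

```python
def run_controller(setpoint, measure, interval=1.0, gain=0.5,
                   steps=50, lo=0.01, hi=10.0):
    """Adjust `interval` until measure(interval) tracks `setpoint`.

    measure: callable returning the observed delivery latency for a given
    control-message interval. Larger intervals are assumed to mean less
    message overhead but higher latency.
    """
    for _ in range(steps):
        error = setpoint - measure(interval)
        # Integral action: accumulate the error into the actuator value,
        # clamped to a safe operating range.
        interval = min(hi, max(lo, interval + gain * error))
    return interval
```

With the toy model `latency = 2 + 1.5 * interval` and a set-point of 5, the loop settles at an interval of 2. If the set-point is later changed, rerunning the loop from the current interval moves the protocol to the new operating point.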

18.
Group communication is a useful mechanism for guaranteeing consistency among replicated objects. The existing approaches do not allow transparent plug-in of group communication protocols into CORBA. They either require modification of CORBA or the OS, or provide no room for incorporating group communication transport protocols into CORBA. We thus propose a generic group communication framework that allows transparent plug-in of various group communication protocols with no modification of existing CORBA. We extend the open communications interface (OCI) to support interoperability, reusability of existing group communication, and independence from the ORB and OS. We also define the group communication inter-ORB protocol (GCIOP) as a group communication instantiation of the general inter-ORB protocol (GIOP) that encapsulates underlying group communication protocols. The proposed scheme can be exploited for fault-tolerant CORBA (FT CORBA).

19.
Current group communication services have mostly been implemented on a homogeneous, distributed computing environment. This limits their applicability because most modern distributed computing environments are heterogeneous in nature. This paper describes the design, implementation, and performance evaluation of a CORBA group communication service. Using CORBA to implement a group communication service enables that group communication service to operate in a heterogeneous, distributed computing environment. To evaluate the effect of CORBA on the performance of a group communication service, this paper provides a detailed comparison of the performance measured from three implementations of an atomic broadcast protocol and a group membership protocol. Two of these implementations use CORBA, while the third uses UDP sockets for interprocess communication. The main conclusion is that heterogeneity can be achieved in group communication services by implementing them using CORBA, but there is a substantial performance cost. This performance cost can be reduced to a certain extent by carefully choosing a design and tuning various protocol parameters such as buffer sizes and timer values.

20.
The study reports results from an experiment investigating aspects of communicative processes, using face-to-face (FtF) communication and computer-mediated communication (CMC). The latter was performed in two variants: participants writing under their own names or participants writing anonymously. There were two problems to be solved, both having ambiguous solutions. The theoretical aim was to determine if gender would influence communication equality, social relations, and communicative processes. Furthermore, private and public self-awareness was studied in order to identify differences between the media and between the sexes. The results show that participants discussing FtF were more privately self-aware than participants in CMC, and females were more privately self-aware than males. Females produced more messages in FtF communication than they did in CMC, and there was also more opinion change from females than from males. Social judgements were more positive from females than from males. A qualitative analysis showed that females expressed more opinions and agreements in FtF communication than in CMC, but also that they agreed more than males in responding to messages from a male. There were also more disagreements in FtF communication than in CMC.
