首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
has been shown to be the weakest realistic failure detector class needed to solve the consensus problem in an asynchronous distributed system prone to f<n process crashes in which communication is by message-passing. However, the only protocol that is known to meet this bound is based on three layers of protocol construction, and is therefore not efficient. This paper presents a surprisingly simple and very efficient direct message-passing implementation of a -based consensus protocol, and proves its correctness.  相似文献   

2.
Impossibility results and best-case lower bounds are proved for the number of message delays and the number of processes required to reach agreement in an asynchronous consensus algorithm that tolerates non-Byzantine failures. General algorithms exist that achieve these lower bounds in the normal case, when the response time of non-faulty processes and the transmission delay of messages they send to one another are bounded. Our theorems allow algorithms to do better in certain exceptional cases, and such algorithms are presented. Two of these exceptional algorithms may be of practical interest.  相似文献   

3.
Failure detectors have been shown to be a very useful mechanism to solve the consensus problem in the crash failure model, for which a number of communication-efficient algorithms have been proposed. In this paper we deal with the definition, implementation and use of communication-efficient failure detectors in the general omission failure model, where processes can fail by crashing and by omitting messages when sending and/or receiving. We first define a new failure detector class for this model in terms of completeness and accuracy properties. Then we propose an algorithm that implements a failure detector of the proposed class in a communication-efficient way, in the sense that only a linear number of links are used to send messages forever. We also explain how the well-known consensus algorithm of Chandra and Toueg can be adapted in order to use the proposed failure detector.  相似文献   

4.
Existing consensus protocols suffer from slowdowns caused by the failures of processes and the mistakes made by the underlying oracles. In this paper, we propose two novel techniques to circumvent such slowdowns in failure-detector-based consensus protocols. The first technique guarantees the Round-Zero-Degradation (RZD) property (an extension of the Zero-Degradation property) in order to avoid the slowdown caused by a failed coordinator process. The second technique, named “Look-Ahead”, helps speed up the execution of the consensus protocol by making use of the messages delivered before their receivers enter the corresponding phases or rounds. The first technique is effective only when the underlying failure detector makes no or few mistakes, while the second technique always works well regardless of the performance of the failure detector. Moreover, Look-Ahead is a general technique and can be applied to consensus protocols based on any kind of oracle. By applying the two proposed techniques, several consensus protocols are developed. The simulation results show that the RZD technique is effective even if the error rate of the failure detector reaches about 15%, while the Look-Ahead technique can always improve the performance in all cases.  相似文献   

5.
An indulgent algorithm is a distributed algorithm that tolerates asynchronous periods of the network when process crash detection is unreliable. This paper presents a tight bound on the time complexity of indulgent consensus algorithms. We consider a round-based eventually synchronous model, and we show that any t-resilient consensus algorithm in this model, requires at least t+2 rounds for a global decision even in runs that are synchronous. We contrast our lower bound with the well-known t+1 round tight bound on consensus in the synchronous model. We then prove the bound to be tight by exhibiting a new t-resilient consensus algorithm in the eventually synchronous model that reaches a global decision at round t+2 in every synchronous run. Our new algorithm is in this sense significantly faster than the most efficient indulgent algorithm we know of, which requires 2t+2 rounds in synchronous runs. Our lower bound applies to round-based consensus algorithms with unreliable failure detectors such as ⋄ P and ⋄ S, and our matching algorithm can be adapted to such failure detectors. This work is partially supported by the Swiss National Science Foundation (project number 510-207).  相似文献   

6.
A simple proof of the uniform consensus synchronous lower bound   总被引:1,自引:0,他引:1  
We give a simple and intuitive proof of an f+2 round lower bound for uniform consensus. That is, we show that for every uniform consensus algorithm tolerating t failures, and for every f?t−2, there is an execution with f failures that requires f+2 rounds.  相似文献   

7.
Fast Paxos   总被引:1,自引:0,他引:1  
As used in practice, traditional consensus algorithms require three message delays before any process can learn the chosen value. Fast Paxos is an extension of the classic Paxos algorithm that allows the value to be learned in two message delays. How and why the algorithm works are explained informally, and a TLA+ specification of the algorithm appears as an appendix.  相似文献   

8.
9.
Summary TheDistributed Consensus problem involvesn processors each of which holds an initial binary vlaue. At mostt of the processors may be faulty and ignore any protocol (even behaving maliciously), yet it is required that the non-faulty processors eventually agree on a value that was initially held by one of them. In this paper we focus on consensus in networks whose degree is bounded, following the work of Dwork, Peleg, Pippenger and Upfal [8]. In such a context, complete consensus among all the correct processors is not possible and some exceptions must be allowed. We first show how to achieve consensus in the butterfly network usingO(t+lognloglogn) one-bit parallel transmission steps, while tolerating the asymptotically optimal number of faulty processors (O(n/logn)) and having the asymptotically minimal number of exceptions (O(tlogt)). This result considerably improves on the running time of existing butterfly consensus protocols [2, 8]. In particular, it replaces the running time ofO(nlognloglogn) of [2] with an asymptotically optimal one. As in [8], we can then decrease the number of exceptions toO(t) by using additional links, while maintaining the same running time. The protocol is derived from a consensus protocol for completely connected networks that is interesting in its own right: it achieves Distributed Consensus with optimal number of processors, asymptotically optimal total bit transfer and nearly optimal number of rounds. Piotr Berman was born in 1955 in Warsaw, Poland, where here progressed from day-care to the degree of Master of Mathematics obtained from the University of Warsaw in 1978. He later studied at the Polish Academy of Sciences and MIT. He received a Ph.D. in Mathematics from MIT in 1985. From 1982 he has been teaching at Penn State, where currently he has a permanent position. His research interests are in two areas: fault-tolerant distributed computing and approximation algorithms. His non-professional hobbies include ancient history and mountain hiking. Juan Alberto Garay is originally from Rosario, Argentina, where he received his degree of Electrical Engineering from the Universidad Nacional de Rosario in 1976, at the age of 21. He then received his Master's degree in Electronic Engineering from the Eindhoven International Institute of the Eindhoven University of Technology (Eindhoven, Holland) in 1981, and his Ph.D. in Computer Science at Penn State University (University Park, PA) in 1989. Between his first and second degrees he worked as a digital design engineer for SOMISA (San Nicola's, Argentina), and between his second and Ph.D. degrees as a systems engineer for IBM Argentina. During the 1989/1990 academic year he was a visiting Assistant Professor at Bucknell University (Lewisburg, PA), and since 1990 he is with IBM's T.J. Watson Research Center (Yorktown Heights, NY). In 1992 he spent 6 months at The Weizmann Institute of Science (Rehovot, Israel) as a postdoctoral fellow. His professional interests include algorithms and lower bounds, distributed computation and fault tolerance. Dr. Garay enjoys tennis, music and philosophy.Preliminary version appeared in Proc 4th International Workshop on Distributed Algorithms, LNCS 486 (Springer-Verlag), pp 321–333, 1990  相似文献   

10.
Determining the “weakest” failure detectors is a central topic in solving many agreement problems such as Consensus, Non-Blocking Atomic Commit and Election in asynchronous distributed systems. So far, this has been studied extensively for several such fundamental problems. It is stated that Perfect Failure Detector P is the weakest failure detector to solve the Election problem with any number of faulty processes. In this paper, we introduce Modal failure detector M and show that to solve Election, M is the weakest failure detector to solve election when the number of faulty processes is less than ⌈n/2⌉. We also show that it is strictly weaker than P.
Sung Hoon ParkEmail:
  相似文献   

11.
12.
The concept of unreliable failure detector was introduced by Chandra and Toueg as a mechanism that provides information about process failures. This mechanism has been used to solve several agreement problems, such as the consensus problem. In this paper, algorithms that implement failure detectors in partially synchronous systems are presented. First two simple algorithms of the weakest class to solve the consensus problem, namely the Eventually Strong class (?S), are presented. While the first algorithm is wait-free, the second algorithm is f-resilient, where f is a known upper bound on the number of faulty processes. Both algorithms guarantee that, eventually, all the correct processes agree permanently on a common correct process, i.e. they also implement a failure detector of the class Omega (Ω). They are also shown to be optimal in terms of the number of communication links used forever. Additionally, a wait-free algorithm that implements a failure detector of the Eventually Perfect class (?P) is presented. This algorithm is shown to be optimal in terms of the number of bidirectional links used forever.  相似文献   

13.
Dwork, Lynch, and Stockmeyer [3] and Lamport [4] showed that, in order to solve Consensus in a distributed system, it is sufficient that the system behaves well during a finite period of time. In sharp contrast, Chandra, Hadzilacos, and Toueg [6] proved that a failure detector that, from some time on, provides “good” information forever is necessary. We explain that this apparent paradox is due to the two-layered structure of the failure detector model. This structure also has impact on comparison relations between failure detectors. In particular, we make explicit why the classic relation is neither reflexive nor extends the natural history-wise inclusion. Our point is to help understanding existing models and to study how they model real distributed systems in an accurate way.  相似文献   

14.
This paper presents two main contributions: semi-passive replication and Lazy Consensus. The former is a replication technique with parsimonious processing. It is based on the latter; a variant of Consensus allowing the lazy evaluation of proposed values.Semi-passive replication is a replication technique with parsimonious processing. This means that, in the normal case, each request is processed by only one single process. The most significant aspect of semi-passive replication is that it requires a weaker system model than existing techniques of the same family. For semi-passive replication, we give an algorithm based on the Lazy Consensus.Lazy Consensus is a variant of the Consensus problem that allows the lazy evaluation of proposed values, hence the name. The main difference with Consensus is the introduction of an additional property of laziness. This property requires that proposed values are computed only when they are actually needed. We present an algorithm based on Chandra and Toueg's Consensus algorithm for asynchronous distributed systems with a S failure detector.  相似文献   

15.
This short note shows how a simple extension of object types with consensus number 2 boosts them to an infinite consensus number. This extension is a simple embedding of a shared memory write within the base operation defining the corresponding type with consensus number 2. The style of this note is voluntarily informal.  相似文献   

16.
This paper presents a shared-memory self-stabilizing failure detector, asynchronous consensus and replicated state-machine algorithm suite, the components of which can be started in an arbitrary state and converge to act as a virtual state-machine. Self-stabilizing algorithms can cope with transient faults. Transient faults can alter the system state to an arbitrary state and hence, cause a temporary violation of the safety property of the consensus. Started in an arbitrary state, the long lived, memory bounded and self-stabilizing failure detector, asynchronous consensus, and replicated state-machine suite, presented in the paper, recovers to satisfy eventual safety and eventual liveness requirements. Several new techniques and paradigms are introduced. The bounded memory failure detector abstracts away synchronization assumptions using bounded heartbeat counters combined with a balance–unbalance mechanism. The practically infinite paradigm is introduced in the scope of self-stabilization, where an execution of, say, 264 sequential steps is regarded as (practically) infinite. Finally, we present the first self-stabilizing wait-free reset mechanism that ensures eventual safety and can be used to implement efficient self-stabilizing timestamps that are of independent interest.  相似文献   

17.
Easy proofs are given, of the impossibility of solving several consensus problems (Byzantine agreement, weak agreement, Byzantine firing squad, approximate agreement and clock synchronization) in certain communication graphs.It is shown that, in the presence ofm faults, no solution to these problems exists for communication graphs with fewer than 3m+1 nodes or less than 2m+1 connectivity. While some of these results had previously been proved, the new proofs are much simpler, provide considerably more insight, apply to more general models of computation, and (particularly in the case of clock synchronization) significantly strengthen the results.Michael J. Fischer is currently Professor of Computer Science at Yale University, New Haven, CT, where he heads the Theory of Computation Group. He is also Editor-in-Chief of the Journal of the Association for Computing Machinery. His research interests include theory of distributed systems, cryptographic protocols, and computational complexity.Dr. Fischer received the B. S. degree in matheamtics from the University of Michigan, Ann Arbor, in 1963, and the M. A. and Ph. D. degrees in applied mathematics from Harvard University, Cambridge, MA, in 1965 and 1968, respectively. He has taught previously at Carnegie-Mellon University, the Massachusetts Institute of Technology, and University of Washington.Nancy Lynch is currently Associate professor of Computer Science at M.I.T., and heads the Theory of Distributed Systems group in M.I.T.'s Laboratory for Computer Science. Her interests are in all aspects of distributed computing theory, including formal models, algorithms, analysis, and correctness proofs. Dr. Lynch received the B.S. degree in mathematics from Brooklyn College in 1968 and the Ph. D. degree in mathematics from M.I.T. in 1972. She has served on the faculty of Tufts University, the University of Southern California, Florida International University, Georgia Tech.Michael Merritt is currently a member of the technical staff with AT&T Bell Laboratories. During the 1984 –85 academic year, he was a visiting lecturer at M.I.T., sponsered by Bell Labs. His research interests include distributed computation, cryptography and security. Dr. Merritt received the B. S. degree in computer science and philosophy from Yale in 1978 and the M. Sc. and Ph. D. degrees in 1980 and 1983, respectively, both in information and computer science from Georgia Tech. He is a member of SIGACT and of Computer Professionals for Social Responsibility.This paper has appeared in the ACM Conference Proceedings of PODC 1985. © 1985, Association for Computing Machinery, reprinted by permission  相似文献   

18.
19.
In this paper, we present two bounded cost algorithms that solve multivalued consensus using binary consensus instances. Our first algorithm uses log2n number of binary consensus instances where n is the number of processes, while our second algorithm uses at most binary consensus instances, where is the maximum length of the binary representation of all proposed values in the run. Both algorithms are significant improvements over the previous algorithm in [A. Mostefaoui, M. Raynal, F. Tronel, From binary consensus to multivalued consensus in asynchronous message-passing systems, Information Processing Letters 73 (5–6) (2000) 207–212], where the number of binary consensus instances needed to solve one multivalued consensus is unbounded.  相似文献   

20.
The failure detector class Omega (Ω) provides an eventual leader election functionality, i.e., eventually all correct processes permanently trust the same correct process. An algorithm is communication-efficient if the number of links that carry messages forever is bounded by n, being n the number of processes in the system. It has been defined that an algorithm is crash-quiescent if it eventually stops sending messages to crashed processes. In this regard, it has been recently shown the impossibility of implementing Ω crash quiescently without a majority of correct processes. We say that the membership is unknown if each process pi only knows its own identity and the number of processes in the system (that is, i and n), but pi does not know the identity of the rest of processes of the system. There is a type of link (denoted by ADD link) in which a bounded (but unknown) number of consecutive messages can be delayed or lost.In this work we present the first implementation (to our knowledge) of Ω in partially synchronous systems with ADD links and with unknown membership. Furthermore, it is the first implementation of Ω that combines two very interesting properties: communication-efficiency and crash-quiescence when the majority of processes are correct. Finally, we also obtain with the same algorithm a failure detector () such that every correct process eventually and permanently outputs the set of all correct processes.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号