期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A Time-Optimal Self-Stabilizing Synchronizer Using A Phase Clock

Awerbuch B. Kutten S. Mansour Y. Patt-Shamir B. Varghese G. 《Dependable and Secure Computing, IEEE Transactions on》2007,4(3):180-190

A synchronizer with a phase counter (sometimes called asynchronous phase clock) is an asynchronous distributed algorithm, where each node maintains a local "pulse counter" that simulates the global clock in a synchronous network. In this paper, we present a time-optimal self-stabilizing scheme for such a synchronizer, assuming unbounded counters. We give a simple rule by which each node can compute its pulse number as a function of its neighbors' pulse numbers. We also show that some of the popular correction functions for phase clock synchronization are not self-stabilizing in asynchronous networks. Using our rule, the counters stabilize in time bounded by the diameter of the network, without invoking global operations. We argue that the use of unbounded counters can be justified by the availability of memory for counters that are large enough to be practically unbounded and by the existence of reset protocols that can be used to restart the counters in some rare cases where faults will make this necessary. 相似文献

2.

A Timing Assumption and Two t-Resilient Protocols for Implementing an Eventual Leader Service in Asynchronous Shared Memory Systems

Antonio Fernández Ernesto Jiménez Michel Raynal Gilles Trédan 《Algorithmica》2010,56(4):550-576

This paper considers the problem of electing an eventual leader in an asynchronous shared memory system. While this problem has received a lot of attention in message-passing systems, very few solutions have been proposed for shared memory systems. As an eventual leader cannot be elected in a pure asynchronous system prone to process crashes, the paper first proposes to enrich the asynchronous system model with an additional assumption. That assumption (denoted AWB) is particularly weak. It is made up of two complementary parts. More precisely, it requires that, after some time, (1) there is a process whose write accesses to some shared variables be timely, and (2) the timers of (t−f) other processes be asymptotically well-behaved (t denotes the maximal number of processes that may crash, and f the actual number of process crashes in a run). The asymptotically well-behaved timer notion is a new notion that generalizes and weakens the traditional notion of timers whose durations are required to monotonically increase when the values they are set to increase (a timer works incorrectly when it expires at arbitrary times, i.e., independently of the value it has been set to). The paper then focuses on the design of t-resilient AWB-based eventual leader protocols. “t-resilient” means that each protocol can cope with up to t process crashes (taking t=n−1 provides wait-free protocols, i.e., protocols that can cope with any number of process failures). Two protocols are presented. The first enjoys the following noteworthy properties: after some time only the elected leader has to write the shared memory, and all but one shared variables have a bounded domain, be the execution finite or infinite. This protocol is consequently optimal with respect to the number of processes that have to write the shared memory. The second protocol guarantees that all the shared variables have a bounded domain. This is obtained at the following additional price: t+1 processes are required to forever write the shared memory. A theorem is proved which states that this price has to be paid by any protocol that elects an eventual leader in a bounded shared memory model. This second protocol is consequently optimal with respect to the number of processes that have to write in such a constrained memory model. In a very interesting way, these protocols show an inherent tradeoff relating the number of processes that have to write the shared memory and the bounded/unbounded attribute of that memory. 相似文献

3.

Fault-tolerance and self-stabilization: impossibility results and solutions using self-stabilizing failure detectors

JOFFROY BEAUQUIER SYNNÖVE KEKKONEN-MONETA 《International journal of systems science》2013,44(11):1177-1187

This paper focuses on protocols that are simultaneously resilient to permanent failures (crash faults) and transient failures (memory and message corruption). First, we show that asynchronous round-based and fault-tolerant protocols cannot be transformed into protocols that are simultaneously fault-tolerant and self-stabilizing (ftss), as is otherwise possible in the synchronous mode of computation. Secondly, we show that it is impossible to find the number of processes (i.e. the size) on a family of networks, as it has been proven for the ring network. Finally, we present a ftss protocol for solving ring size by assuming that each process accesses a failure detector. We also propose two self-stabilizing implementations for the failure detector that differ in their degree of tolerance to transient failures. 相似文献

4.

A simple proof of the necessity of the failure detector Σ to implement an atomic register in asynchronous message-passing systems

François Bonnet 《Information Processing Letters》2010,110(4):153-157

This paper presents a simple proof that shows that the quorum failure detector class (denoted Σ) is the weakest failure detector class required to implement an atomic read/write register in an asynchronous message-passing system prone to an arbitrary number of process crashes. This proof is based on a new reduction algorithm in which all the variables are bounded. 相似文献

5.

Wait-Free Clock Synchronization 总被引：1，自引：0，他引：1

S. Dolev J. L. Welch 《Algorithmica》1997,18(4):486-511

Multiprocessor computer systems are becoming increasingly important as vehicles for solving computationally expensive problems. Synchronization among the processors is achieved with a variety of clock configurations. A new notion of fault-tolerance for clock synchronization algorithms is defined, tailored to the requirements and failure patterns of shared memory multiprocessors. Algorithms in this class can tolerate any number of napping processors, where a napping processor can fail by repeatedly ceasing operation for an arbitrary time interval and then resume operation without necessarily recognizing that a fault has occurred. These algorithms guarantee that, for some fixed k, once a processor P has been working correctly for at least k time, then as long as P continues to work correctly, (1) P does not adjust its clock, and (2) P's clock agrees with the clock of every other processor that has also been working correctly for at least k time. Because a working processor must synchronize in a fixed amount of time regardless of the actions of the other processors, these algorithms are called wait-free. Another useful type of fault-tolerance is called self-stabilization: starting with an arbitrary state of the system, a self-stabilizing algorithm eventually reaches a point after which it correctly performs its task. Two wait-free clock synchronization algorithms are presented for a model with global clock pulses. The first one is self-stabilizing; the second one is not but it converges more quickly than the first one. The self-stabilizing algorithm requires each processor's communication register contents to be a part of the processor's state. This last requirement is proven necessary. A wait-free clock synchronization algorithm is also presented for a model with local clock pulses. This algorithm is not self-stabilizing. Received December 20, 1993; revised January 1995. 相似文献

6.

Fault-containing self-stabilizing distributed protocols

Sukumar Ghosh Arobinda Gupta Ted Herman Sriram V. Pemmaraju 《Distributed Computing》2007,20(1):53-73

Self-stabilization is an elegant approach for designing a class of fault-tolerant distributed protocols. A self-stabilizing protocol is guaranteed to eventually converge to a legitimate state after a transient fault. However, even a minor transient fault can cause vast disruption in the system before legitimacy is reached. This paper introduces the notion of fault-containment to address this particular weakness of self-stabilizing systems. Informally, a fault-containing self-stabilizing protocol, in addition to providing self- stabilization, contains the effects of faults. This ensures that disruption during recovery from faults, is proportional to the extent of the faults. The paper begins with a formal framework for specifying and evaluating fault-containing self-stabilizing protocols. The main result of the paper is a transformer that converts any non-reactive self-stabilizing protocol into an equivalent fault-containing self-stabilizing protocol that can repair any single fault in the system in O(1) time. For a large class of input protocols, the corresponding output protocols produced by the transformer have O(1) space overhead. The small time and space overhead make the fault-containing self-stabilizing protocol a practical alternative to the original self-stabilizing protocol. The transformer is based on a novel stabilizing timer paradigm that significantly simplifies the task of fault-containment. 相似文献

7.

A simple and fast asynchronous consensus protocol based on a weak failure detector 总被引：2，自引：0，他引：2

Michel Hurfin Michel Raynal 《Distributed Computing》1999,12(4):209-223

Summary. The Consensus problem is a fundamental paradigm for fault-tolerant asynchronous systems. It abstracts a family of problems known as Agreement (or Coordination) problems. Any solution to consensus can serve as a basic building block for solving such problems (e.g., atomic commitment or atomic broadcast). Solving consensus in an asynchronous system is not a trivial task: it has been proven (1985) by Fischer, Lynch and Paterson that there is no deterministic solution in asynchronous systems which are subject to even a single crash failure. To circumvent this impossibility result, Chandra and Toueg have introduced the concept of unreliable failure detectors (1991), and have studied how these failure detectors can be used to solve consensus in asynchronous systems with crash failures. This paper presents a new consensus protocol that uses a failure detector of the class . Like previous protocols, it is based on the rotating coordinator paradigm and proceeds in asynchronous rounds. Simplicity and efficiency are the main characteristics of this protocol. From a performance point of view, the protocol is particularly efficient when, whether failures occur or not, the underlying failure detector makes no mistake (a common case in practice). From a design point of view, the protocol is based on the combination of three simple mechanisms: a voting mechanism, a small finite state automaton which manages the behavior of each process, and the possibility for a process to change its mind during a round. Received: August 1997 / Accepted: March 1999 相似文献

8.

Early consensus in an asynchronous system with a weak failure detector 总被引：2，自引：0，他引：2

André Schiper 《Distributed Computing》1997,10(3):149-157

Summary. Consensus is one of the most fundamental problems in the context of fault-tolerant distributed computing. The problem consists, given a set Ω of processes having each an initial value v _i, in deciding among Ω on a common value v. In 1985, Fischer, Lynch and Paterson proved that the consensus problem is not solvable in an asynchronous system subject to a single process crash. In 1991, Chandra and Toueg showed that, by augmenting the asynchronous system model with a well defined unreliable failure detector, consensus becomes solvable. They also give an algorithm that solves consensus using the ◊? failure detector. In this paper we propose a new consensus algorithm, also using the ◊? failure detector, that is more efficient than the Chandra-Toueg consensus algorithm. We measure efficiency by introducing the notion of latency degree, which defines the minimal number of communication steps needed to solve consensus. The Chandra-Toueg algorithm has a latency degree of 3 (it requires at least three communication steps), whereas our early consensus algorithm requires only two communication steps (latency degree of 2). We believe that this is an interesting result, which adds to our current understanding of the cost of consensus algorithms based on ◊?. Received: April 1995 / Accepted: October 1996 相似文献

9.

Phase clocks for transient fault repair

Herman T. 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(10):1048-1057

Phase clocks are synchronization tools that implement a form of logical time in distributed systems. For systems tolerating transient faults by self-repair of damaged data, phase clocks can enable reasoning about the progress of distributed repair procedures. This paper presents a phase clock algorithm suited to the model of transient memory faults in asynchronous systems with read/write registers. The algorithm is self-stabilizing and guarantees accuracy of phase clocks within O(k) time following an initial state that is. 相似文献

10.

A self-stabilizing algorithm for the maximum flow problem 总被引：5，自引：0，他引：5

Sukumar Ghosh Arobinda Gupta Sriram V. Pemmaraju 《Distributed Computing》1997,10(4):167-180

Summary. The maximum flow problem is a fundamental problem in graph theory and combinatorial optimization with a variety of important applications. Known distributed algorithms for this problem do not tolerate faults or adjust to dynamic changes in network topology. This paper presents a distributed self-stabilizing algorithm for the maximum flow problem. Starting from an arbitrary state, the algorithm computes the maximum flow in an acyclic network in finitely many steps. Since the algorithm is self-stabilizing, it is inherently tolerant to transient faults. It can automatically adjust to topology changes and to changes in other parameters of the problem. The paper presents results obtained by extensively experimenting with the algorithm. Two main observations based on these results are (1) the algorithm requires fewer than n ² moves for almost all test cases and (2) the algorithm consistently performs at least as well as a distributed implementation of the well-known Goldberg-Tarjan algorithm for almost all test cases. The paper ends with the conjecture that the algorithm correctly computes a maximum flow even in networks that contain cycles. Received: October 1995 / Accepted: February 1997 相似文献

11.

Synchronous vs. Asynchronous Unison

Christian Boulinier Franck Petit Vincent Villain 《Algorithmica》2008,51(1):61-80

This paper considers the self-stabilizing unison problem in uniform distributed systems. The contribution of this paper is threefold. First, we establish that when any self-stabilizing asynchronous unison protocol runs in synchronous systems, it converges to synchronous unison if the size of the clock K is greater than C _G, C _G being the length of the maximal cycle of the shortest maximal cycle basis if the graph contains cycles, 2 otherwise (tree networks). The second result demonstrates that the asynchronous unison in Boulinier et al. (In PODC ’04: Proceedings of the twenty-third annual ACM symposium on principles of distributed computing, pp. 150–159, 2004) provides a general self-stabilizing synchronous unison for trees which is optimal in memory space, i.e., it works with any K≥3, without any extra state, and stabilizes within 2D rounds, where D is the diameter of the network. This protocol gives a positive answer to the question whether there exists or not a general self-stabilizing synchronous unison for tree networks with a state requirement independent of local or global information of the tree. If K=3, then the stabilization time of this protocol is equal to D only, i.e., it reaches the optimal performance of Herman and Ghosh (Inf. Process. Lett. 54:259–265, 1995). The third result of this paper is a self-stabilizing unison for general synchronous systems. It requires K≥2 only, at least K+D states per process, and its stabilization time is 2D only. This is the best solution for general synchronous systems, both for the state requirement and the stabilization time. A preliminary version of this work was presented at the 7th International Symposium on Self-Stabilizing Systems (SSS 2005), Barcelona, Spain, October 2005. 相似文献

12.

Failure detection and consensus in the crash-recovery model 总被引：2，自引：0，他引：2

Marcos Kawazoe Aguilera Wei Chen Sam Toueg 《Distributed Computing》2000,13(2):99-125

Summary. We study the problems of failure detection and consensus in asynchronous systems in which processes may crash and recover, and links may lose messages. We first propose new failure detectors that are particularly suitable to the crash-recovery model. We next determine under what conditions stable storage is necessary to solve consensus in this model. Using the new failure detectors, we give two consensus algorithms that match these conditions: one requires stable storage and the other does not. Both algorithms tolerate link failures and are particularly efficient in the runs that are most likely in practice – those with no failures or failure detector mistakes. In such runs, consensus is achieved within time and with 4 n messages, where is the maximum message delay and n is the number of processes in the system. Received: May 1998 / Accepted: November 1999 相似文献

13.

Anonymous asynchronous systems: the case of failure detectors

François Bonnet Michel Raynal 《Distributed Computing》2013,26(3):141-158

Due to the multiplicity of loci of control, a main issue distributed systems have to cope with lies in the uncertainty on the system state created by the adversaries that are asynchrony, failures, dynamicity, mobility, etc. Considering message-passing systems, this paper considers the uncertainty created by the net effect of asynchrony and process crash failures in systems where the processes are anonymous (i.e., processes have no identity and locally execute the same algorithm). Trivially, agreement problems such as consensus, that cannot be solved in non-anonymous asynchronous systems prone to process failures, cannot be solved either if the system is anonymous. The paper investigates failure detectors that allow processes to circumvent this impossibility. It has several contributions. It first presents four failure detectors (denoted AP, ${\overline{AP}}$ , AΩ, and AΣ) and show that they are the “identity-free” counterparts of perfect failure detectors, eventual leader failure detectors, and quorum failure detectors, respectively. AΣ is new and showing that AΣ and Σ have the same computability power in a non-anonymous system is not trivial. The paper also shows that the notion of failure detector reduction is related to the computation model. Then, the paper presents and proves correct a uniform anonymous consensus algorithm based on the failure detector pair (AΩ, AΣ) (“uniform” means here that not only processes have no identity, but no process is aware of the total number of processes). This new algorithm is not a simple “straightforward extension” of an algorithm designed for non-anonymous systems. To benefit from AΣ, it uses a novel broadcast facility which encapsulates an AΣ-based message exchange pattern that provides the processes with an interesting intersection property on the set of messages they have exchanged. Finally, the paper discusses the notions of failure detector hierarchy, weakest failure detector for anonymous consensus, and the implementation of identity-free failure detectors in anonymous systems. 相似文献

14.

Failure detectors as type boosters

Rachid Guerraoui Petr Kouznetsov 《Distributed Computing》2008,20(5):343-358

The power of an object type T can be measured as the maximum number n of processes that can solve consensus using only objects of T and registers. This number, denoted cons(T), is called the consensus power of T. This paper addresses the question of the weakest failure detector to solve consensus among a number k > n of processes that communicate using shared objects of a type T with consensus power n. In other words, we seek for a failure detector that is sufficient and necessary to “boost” the consensus power of a type T from n to k. It was shown in Neiger (Proceedings of the 14th annual ACM symposium on principles of distributed computing (PODC), pp. 100–109, 1995) that a certain failure detector, denoted Ω _n, is sufficient to boost the power of a type T from n to k, and it was conjectured that Ω _n was also necessary. In this paper, we prove this conjecture for one-shot deterministic types. We first show that, for any one-shot deterministic type T with cons(T) ≤ n, Ω _n is necessary to boost the power of T from n to n + 1. Then we go a step further and show that Ω _n is also the weakest to boost the power of (n + 1)-ported one-shot deterministic types from n to any k > n. Our result generalizes, in a precise sense, the result of the weakest failure detector to solve consensus in asynchronous message-passing systems (Chandra et al. in J ACM 43(4):685–722, 1996). As a corollary, we show that Ω _t is the weakest failure detector to boost the resilience level of a distributed shared memory system, i.e., to solve consensus among n > t processes using (t − 1)-resilient objects of consensus power t. This paper is a revised and extended version of a paper that appeared in the Proceedings of the 17th International Symposium on Distributed Computing (DISC 2003), entitled “On failure detectors and type boosters.” 相似文献

15.

Secure and self-stabilizing clock synchronization in sensor networks

Jaap-Henk Hoepman Andreas Larsson Philippas Tsigas 《Theoretical computer science》2011,412(40):5631-5647

In sensor networks, correct clocks have arbitrary starting offsets and nondeterministic fluctuating skews. We consider an adversary that aims at tampering with the clock synchronization by intercepting messages, replaying intercepted messages (after the adversary’s choice of delay), and capturing nodes (i.e., revealing their secret keys and impersonating them). We present an efficient clock sampling algorithm which tolerates attacks by this adversary, collisions, a bounded amount of losses due to ambient noise, and a bounded number of captured nodes that can jam, intercept, and send fake messages. The algorithm is self-stabilizing, so if these bounds are temporarily violated, the system can efficiently stabilize back to a correct state. Using this clock sampling algorithm, we construct the first self-stabilizing algorithm for secure clock synchronization in sensor networks that is resilient to the aforementioned adversarial attacks. 相似文献

16.

Self-stabilizing minimum degree spanning tree within one from the optimal degree

Lélia BlinAuthor Vitae Maria Gradinariu Potop-ButucaruAuthor Vitae Stephane Rovedakis^{Author Vitae} 《Journal of Parallel and Distributed Computing》2011,71(3):438-449

We propose a self-stabilizing algorithm for constructing a Minimum Degree Spanning Tree (MDST) in undirected networks. Starting from an arbitrary state, our algorithm is guaranteed to converge to a legitimate state describing a spanning tree whose maximum node degree is at most Δ^∗+1, where Δ^∗ is the minimum possible maximum degree of a spanning tree of the network.To the best of our knowledge, our algorithm is the first self-stabilizing solution for the construction of a minimum degree spanning tree in undirected graphs. The algorithm uses only local communications (nodes interact only with the neighbors at one hop distance). Moreover, the algorithm is designed to work in any asynchronous message passing network with reliable FIFO channels. Additionally, we use a fine grained atomicity model (i.e., the send/receive atomicity). The time complexity of our solution is O(mn²logn) where m is the number of edges and n is the number of nodes. The memory complexity is O(δlogn) in the send-receive atomicity model (δ is the maximal degree of the network). 相似文献

17.

Self-stabilizing depth-first token circulation in arbitrary rooted networks 总被引：1，自引：0，他引：1

Ajoy K. Datta Colette Johnen Franck Petit Vincent Villain 《Distributed Computing》2000,13(4):207-218

Abstract. We present a deterministic distributed depth-first token passing protocol on a rooted network. This protocol uses neither the processor identifiers nor the size of the network, but assumes the existence of a distinguished processor, called the root of the network. The protocol is self-stabilizing, meaning that starting from an arbitrary state (in response to an arbitrary perturbation modifying the memory state), it is guaranteed to reach a state with no more than one token in the network. Our protocol implements a strictly fair token circulation scheme. The proposed protocol has extremely small state requirement – only states per processor, i.e., bits per processor, where is the degree of the network. The protocol can be used to implement a strictly fair distributed mutual exclusion in any rooted network. This protocol can also be used to construct a DFS spanning tree. Received: July 1998 / Accepted: April 2000 相似文献

18.

Robust self-stabilizing weight-based clustering algorithm

Colette Johnen Le Huy Nguyen 《Theoretical computer science》2009,410(6-7):581-594

Ad hoc networks consist of wireless hosts that communicate with each other in the absence of a fixed infrastructure. Such networks cannot rely on centralized and organized network management. The clustering problem consists of partitioning network nodes into non-overlapping groups called clusters. Clusters give a hierarchical organization to the network that facilitates network management and that increases its scalability.In a weight-based clustering algorithm, the clusterheads are selected according to their weight (a node’s parameter). The higher the weight of a node, the more suitable this node is for the role of clusterhead. In ad hoc networks, the amount of bandwidth, memory space or battery power of a node could be used to determine weight values.A self-stabilizing algorithm, regardless of the initial system configuration, converges to legitimate configurations without external intervention. Due to this property, self-stabilizing algorithms tolerate transient faults and they are adaptive to any topology change.In this paper, we present a robust self-stabilizing weight-based clustering algorithm for ad hoc networks. The robustness property guarantees that, starting from an arbitrary configuration, after one asynchronous round, the network is partitioned into clusters. After that, the network stays partitioned during the convergence phase toward a legitimate configuration where the clusters verify the “ad hoc clustering properties”. 相似文献

19.

Two-state self-stabilizing algorithms for token rings

Flatebo M. Datta A.K. 《IEEE transactions on pattern analysis and machine intelligence》1994,20(6):500-504

A self-stabilizing system is a network of processors, which, when started from an arbitrary (and possibly illegal) initial state, always returns to a legal state in a finite number of steps. This implies that the system can automatically deal with infrequent errors. One issue in designing self-stabilizing algorithms is the number of states required by each machine. This paper presents mutual exclusion algorithms which will be self-stabilizing while only requiring each machine in the network to have two states. The concept of a randomized central demon is also introduced in this paper. The first algorithm is a starting point where no randomization is needed (the randomized central demon is not necessary). The other two algorithms require randomization. The second algorithm builds on the first algorithm and reduces the number of network connections required. Finally, the number of necessary connections is again reduced yielding the final two-state, probabilistic algorithm for an asynchronous, unidirectional ring of processes 相似文献

20.

Communication adaptive self-stabilizing group membership service

Dolev S. Schiller E. 《Parallel and Distributed Systems, IEEE Transactions on》2003,14(7):709-720

This paper presents the first (randomized) algorithm for implementing self-stabilizing group communication services in an asynchronous system. Our algorithm converges rapidly to legal behavior and is communication adaptive, namely, the communication volume is high when the system recovers from the occurrence of faults and is low once a legal state is reached. Communication adaptability is achieved by a new technique that combines transient fault detectors. 相似文献