Similar Articles
20 similar articles found.
1.
We present a shared memory algorithm that allows a set of f+1 processes to wait-free "simulate" a larger system of n processes that may also exhibit up to f stopping failures. Applying this simulation algorithm to the k-set-agreement problem enables conversion of an arbitrary k-fault-tolerant n-process solution for the k-set-agreement problem into a wait-free (k+1)-process solution for the same problem. Since the (k+1)-process k-set-agreement problem has been shown to have no wait-free solution [5,18,26], this transformation implies that there is no k-fault-tolerant solution to the n-process k-set-agreement problem, for any n. More generally, the algorithm satisfies the requirements of a fault-tolerant distributed simulation. The distributed simulation implements a notion of fault-tolerant reducibility between decision problems. This paper defines these notions and gives examples of their application to fundamental distributed computing problems. The algorithm is presented and verified in terms of I/O automata. The presentation has a great deal of interesting modularity, expressed by I/O automaton composition and both forward and backward simulation relations. Composition is used to include a safe agreement module as a subroutine. Forward and backward simulation relations are used to view the algorithm as implementing a multi-try snapshot strategy. The main algorithm works in snapshot shared memory systems; a simple modification of the algorithm that works in read/write shared memory systems is also presented. Received: February 2001 / Accepted: February 2001
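The safe-agreement subroutine mentioned above can be illustrated with a small sketch. The code below is a minimal, single-machine rendering of the safe-agreement protocol as it is usually presented in the literature (announce at level 1, snapshot, then commit at level 2 or step back), not the paper's I/O-automaton formulation; the lock-emulated snapshot, process ids, and value types are illustrative assumptions.

```python
import threading

class SafeAgreement:
    """Toy safe agreement for n processes sharing memory in one address space.

    Each process calls propose(i, v) once; read(i) then returns a common
    decision, provided every process that started propose() finished it.
    A crash inside propose() may block read() forever; that "unsafe window"
    is exactly what safe agreement permits.
    """

    def __init__(self, n):
        self.n = n
        self.value = [None] * n        # announced proposals
        self.level = [0] * n           # 0 = not started, 1 = announcing, 2 = committed
        self.lock = threading.Lock()   # emulates an atomic snapshot of `level`

    def propose(self, i, v):
        self.value[i] = v
        with self.lock:
            self.level[i] = 1
        with self.lock:                # atomic snapshot of all levels
            snap = list(self.level)
        with self.lock:
            # If someone already committed (level 2), step back to level 0;
            # otherwise commit our own proposal at level 2.
            self.level[i] = 0 if 2 in snap else 2

    def read(self, i):
        while True:
            with self.lock:
                snap = list(self.level)
            if 1 not in snap:          # wait until nobody is mid-announcement
                break
        # All readers see the same committed set, so they all return the
        # proposal of the smallest-id process that reached level 2.
        winner = min(j for j in range(self.n) if snap[j] == 2)
        return self.value[winner]
```

In the simulation, the f+1 simulators run one such object per simulated step of each simulated process; because a simulator is inside the unsafe window of at most one object at a time, a simulator crash can block at most one simulated process, which is why f+1 real processes suffice to tolerate f simulated failures.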

2.
Synchronous atomic broadcast for redundant broadcast channels
We propose a synchronous atomic broadcast protocol for distributed real-time systems based on redundant broadcast channels. The protocol can tolerate a finite number f of concurrent processor crash failures, channel adapter performance failures, and channel omission failures. Its message cost is optimal: when no failures occur, only f+1 messages are sent per broadcast. The cost implications of providing tolerance to other failure classes are also investigated.

3.
Summary. In a shared-memory distributed system, n independent asynchronous processes communicate by reading and writing to shared variables. An algorithm is adaptive (to total contention) if its step complexity depends only on the actual number, k, of active processes in the execution; this number is unknown in advance and may change in different executions of the algorithm. Adaptive algorithms are inherently wait-free, providing fault tolerance in the presence of an arbitrary number of crash failures and of varying process speeds. A wait-free adaptive collect algorithm with O(k) step complexity is presented, together with its applications in wait-free adaptive algorithms for atomic snapshots, immediate snapshots and renaming. Received: August 1999 / Accepted: August 2001

4.
We consider the problem of how to schedule t similar and independent tasks to be performed in a synchronous distributed system of p stations communicating via multiple-access channels. Stations are prone to crashes whose patterns of occurrence are specified by adversarial models. Work, defined as the number of available processor steps, is the complexity measure. We consider only reliable algorithms that perform all the tasks as long as at least one station remains operational. It is shown that every reliable algorithm has to perform work Ω(t + p√t) even when no failures occur. An optimal deterministic algorithm for the channel with collision detection is developed, which performs work O(t + p√t). Another algorithm, for the channel without collision detection, performs work O(t + p√t + p·min{f, t}), where f < p is the number of failures. This algorithm is proved to be optimal, provided that the adversary is restricted to failing no more than f stations. Finally, we consider the question of whether randomization helps against weaker adversaries for the channel without collision detection. A randomized algorithm is developed which performs the expected minimum amount O(t + p√t) of work, provided that the adversary may fail a constant fraction of the stations and has to select the failure-prone stations prior to the start of an execution of the algorithm. The work of the first author is supported by the NSF Grant 0310503. A preliminary version of this paper appeared as "The do-all problem in broadcast networks," in Proceedings, 20th ACM Symposium on Principles of Distributed Computing, Newport, Rhode Island, 2001, pp. 117–126.

5.
Summary. Long-lived and adaptive implementations of mutual exclusion and renaming in the read/write shared memory model are presented. An implementation of a task is adaptive if the step complexity of any operation in the implementation is a function of the number of processes that take steps concurrently with the operation. The renaming algorithm assigns a new unique id, from a range whose size is a function of k alone, to any process whose initial unique name is taken from a set of size N, for an arbitrary N, where k is the number of processes that actually take steps or hold a name while the new name is being acquired. The step complexity of acquiring a new name is likewise a function of k, while the step complexity of releasing a name is 1. The space complexity of the algorithm is bounded in terms of n, where n is an upper bound on the number of processes that may be active at the same time (acquiring or holding new names), which could be N in the worst case. Both the system response time and the worst-case number of operations per process in the presented mutual-exclusion algorithm are adaptive. Both algorithms rely on the basic building block of a long-lived and adaptive splitter. While the adaptive splitter satisfies a slightly different set of properties than the Moir-Anderson splitter [MA95], it is adaptive and long-lived. In addition, the new splitter properties enable the construction of a non-blocking long-lived (2k-1)-renaming algorithm (which is optimal in the size of the new name space). We believe that the mechanisms introduced in our splitter implementation are interesting in their own right, and might be used in other adaptive and long-lived constructions. Received: March 2000 / Accepted: July 2001

6.
Summary. The problem of using P processes to write a given value to all positions of a shared array of size N is called the Write-All problem. We present and analyze an asynchronous algorithm for this problem and bound its work complexity as a function of N and P. Our algorithm is a generalization of the naive two-processor algorithm in which the two processes each start at one side of the array and walk towards each other until they collide. Received: October 1999 / Accepted: September 2000
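As a concrete illustration of the naive two-processor strategy that the algorithm generalizes, here is a minimal sketch in Python; the random interleaving and the optional "crash" stand in for an asynchronous, failure-prone scheduler and are purely illustrative, not the paper's model or analysis.

```python
import random

def naive_write_all(N, value=1, seed=None):
    """Two 'processes' fill an array of size N with `value`.

    Process A writes cells left to right, process B right to left; they stop
    once their positions cross ("collide").  Random interleaving emulates an
    asynchronous scheduler; a crashed process simply stops being scheduled,
    and the survivor still completes the whole array.
    """
    rng = random.Random(seed)
    a = [0] * N
    left, right = 0, N - 1          # next cells for A and B respectively
    alive = {"A", "B"}
    while left <= right:
        p = rng.choice(sorted(alive))
        if p == "A":
            a[left] = value
            left += 1
        else:
            a[right] = value
            right -= 1
        # Occasionally crash one process (never both) to see fault tolerance.
        if len(alive) == 2 and rng.random() < 0.01:
            alive.discard(p)
    return a

if __name__ == "__main__":
    filled = naive_write_all(20, seed=7)
    assert all(x == 1 for x in filled)
    print(filled)
```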

7.
We consider the problem of locating replicas in a network to minimize communication costs. Under the assumption that the read-one-write-all policy is used to ensure data consistency, an optimization problem is formulated in which the cost function estimates the total communication cost. The paper concentrates on the study of the optimal communication cost as a function of the ratio between the frequencies of the read and write operations. The problem is reformulated as a zero-one linear programming problem, and its connection to the p-median problem is explained. The general problem is proved to be NP-complete. For path graphs, a dynamic programming algorithm for the problem is presented. Received: May 1993 / Accepted: June 2001
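To make the read-one-write-all cost model concrete, the sketch below evaluates one plausible instantiation of the objective on a path graph by brute force; the unit edge lengths, the exact cost formula, and the tiny input are assumptions for illustration, not the paper's formulation or its dynamic programming algorithm.

```python
from itertools import combinations

def placement_cost(reads, writes, replicas):
    """Total communication cost on a path of n nodes with unit edge lengths.

    Under read-one-write-all, node i sends each read to its nearest replica
    and each write to every replica; the cost of one message is taken to be
    the hop distance on the path.
    """
    cost = 0.0
    for i, (r, w) in enumerate(zip(reads, writes)):
        cost += r * min(abs(i - j) for j in replicas)        # read-one
        cost += w * sum(abs(i - j) for j in replicas)        # write-all
    return cost

def best_placement(reads, writes):
    """Exhaustively search all non-empty replica sets (fine for small n)."""
    n = len(reads)
    best = None
    for k in range(1, n + 1):
        for replicas in combinations(range(n), k):
            c = placement_cost(reads, writes, replicas)
            if best is None or c < best[0]:
                best = (c, replicas)
    return best

if __name__ == "__main__":
    reads  = [5, 1, 4, 2, 8]   # read frequencies per node
    writes = [1, 1, 1, 1, 1]   # write frequencies per node
    print(best_placement(reads, writes))
```

As the read-to-write ratio grows, the optimal set tends to include more replicas, which is exactly the trade-off the paper studies as a function of that ratio.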

8.
Failure detection and consensus in the crash-recovery model
Summary. We study the problems of failure detection and consensus in asynchronous systems in which processes may crash and recover, and links may lose messages. We first propose new failure detectors that are particularly suitable to the crash-recovery model. We next determine under what conditions stable storage is necessary to solve consensus in this model. Using the new failure detectors, we give two consensus algorithms that match these conditions: one requires stable storage and the other does not. Both algorithms tolerate link failures and are particularly efficient in the runs that are most likely in practice, namely those with no failures or failure detector mistakes. In such runs, consensus is achieved within a small constant number of maximum-message-delay periods and with 4n messages, where n is the number of processes in the system. Received: May 1998 / Accepted: November 1999
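A rough intuition for failure detection in the crash-recovery model is that a process must distinguish peers that stay up from those that are crashed or keep crashing and recovering. The sketch below is a toy, single-process view of such a detector based on epoch numbers carried in heartbeats; the epoch mechanism, thresholds, and method names are illustrative assumptions, not the detectors defined in the paper.

```python
import time

class CrashRecoveryDetector:
    """Tracks, for each peer, the last heartbeat time and the last epoch seen.

    A peer is suspected if its heartbeats stop arriving; a peer whose epoch
    keeps increasing is flagged as unstable (it crashes and recovers too
    often to be relied upon).
    """

    def __init__(self, timeout=2.0, max_epoch_jumps=3):
        self.timeout = timeout
        self.max_epoch_jumps = max_epoch_jumps
        self.last_seen = {}        # peer id -> time of last heartbeat
        self.epoch = {}            # peer id -> last epoch number seen
        self.epoch_jumps = {}      # peer id -> how many times the epoch grew

    def on_heartbeat(self, peer, epoch, now=None):
        now = time.monotonic() if now is None else now
        if peer in self.epoch and epoch > self.epoch[peer]:
            self.epoch_jumps[peer] = self.epoch_jumps.get(peer, 0) + 1
        self.epoch[peer] = epoch
        self.last_seen[peer] = now

    def suspected(self, peer, now=None):
        now = time.monotonic() if now is None else now
        never_heard = peer not in self.last_seen
        silent = (not never_heard) and now - self.last_seen[peer] > self.timeout
        return never_heard or silent

    def unstable(self, peer):
        return self.epoch_jumps.get(peer, 0) >= self.max_epoch_jumps
```

In this toy setting each process would increment its epoch every time it recovers and include the epoch in every heartbeat it sends.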

9.
This paper describes a parameterized distributed algorithm applicable to any directed graph topology. The function parameter of our algorithm is instantiated to produce distributed algorithms for both fundamental and high-level applications, such as shortest-path computation and depth-first-search tree construction. Due to the fault-resilience properties of our algorithm, the resulting protocols are self-stabilizing at no additional cost. Self-stabilizing protocols can resist transient failures and guarantee system recovery in finite time. Since the condition on the function parameter (being a strictly idempotent r-operator) permits a broad range of applications to be implemented, the solution presented in our paper can be useful for a large class of distributed systems. Received: August 1999 / Accepted: January 2001

10.
11.
Summary. This paper presents adaptive algorithms for mutual exclusion using only read and write operations; the performance of the algorithms depends only on the point contention, i.e., the number of processes that are concurrently active during the algorithm execution (and not on n, the total number of processes). Our algorithm has O(k) remote step complexity and O(log k) system response time, where k is the point contention. The remote step complexity is the maximal number of steps performed by a process, where a wait is counted as one step. The system response time is the time interval between subsequent entries to the critical section, where one time unit is the minimal interval in which every active process performs at least one step. The space complexity of this algorithm is O(N log n), where N is the range of processes' names. We show how to make the space complexity of our algorithm depend solely on n, while preserving the other performance measures of the algorithm. Received: March 2001 / Accepted: November 2001

12.
We consider the task of assigning distinct labels to nodes of an unknown anonymous network in a distributed manner. A priori, nodes do not have any identities, except for one distinguished node, called the source, and do not know the topology or the size of the network. They execute identical algorithms, apart from the source which plays the role of a leader and starts the labeling process. Our goal is to assign short labels, as fast as possible. The quality of a labeling algorithm is measured by the range from which the algorithm picks the labels, or alternatively, the length of the assigned labels. Natural efficiency measures are the time, i.e., the number of rounds required for the label assignment, and the message and bit complexities of the label assignment protocol, i.e., the total number of messages (resp., bits) circulating in the network. We present label assignment algorithms whose time and message complexity are asymptotically optimal and which assign short labels. On the other hand, we establish inherent trade-offs between quality and efficiency for labeling algorithms. Received: July 2000 / Accepted: February 2001
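For intuition, here is a minimal sketch of one straightforward way a distinguished source can hand out short distinct labels in a network whose nodes have no identifiers: a breadth-first wave in which each labeled node assigns consecutive integers to its still-unlabeled neighbours. The centralized simulation, adjacency lists, and discovery-rank labels are illustrative assumptions; this only depicts the problem setting, not the asymptotically optimal algorithms of the paper.

```python
from collections import deque

def bfs_labeling(adj, source=0):
    """Assign labels 0..n-1 outward from `source` over an undirected graph.

    `adj` maps each node (known only to this simulator, not to the nodes
    themselves) to the list of its neighbours in port order.  A node's label
    is its discovery rank, so labels are distinct, fit in ceil(log2 n) bits,
    and the number of rounds is proportional to the source's eccentricity.
    """
    labels = {source: 0}
    next_label = 1
    frontier = deque([source])
    rounds = 0
    while frontier:
        rounds += 1
        next_frontier = deque()
        for u in frontier:
            for v in adj[u]:                # one message per incident edge
                if v not in labels:
                    labels[v] = next_label  # label carried in the message
                    next_label += 1
                    next_frontier.append(v)
        frontier = next_frontier
    return labels, rounds

if __name__ == "__main__":
    ring_of_5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
    print(bfs_labeling(ring_of_5))
```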

13.
The ability to cooperate on common tasks in a distributed setting is key to solving a broad range of computation problems, ranging from distributed search such as SETI to distributed simulation and multi-agent collaboration. Do-All, an abstraction of such cooperative activity, is the problem of performing N tasks in a distributed system of P failure-prone processors. Many distributed and parallel algorithms have been developed for this problem, and several algorithm simulations have been developed by iterating Do-All algorithms. The efficiency of solutions for Do-All is measured in terms of work complexity, where all processing steps taken by all processors are counted. Work is ideally expressed as a function of N, P, and f, the number of processor crashes. However, the known lower bounds and the upper bounds for extant algorithms do not adequately show how work depends on f. We present the first non-trivial lower bounds for Do-All that capture the dependence of work on N, P and f. For the model of computation where processors are able to make perfect load-balancing decisions locally, we also present matching upper bounds. We define the r-iterative Do-All problem that abstracts the repeated use of Do-All such as found in typical algorithm simulations. Our f-sensitive analysis enables us to derive tight bounds for r-iterative Do-All work (that are stronger than the r-fold work complexity of a single Do-All). Our approach of modeling perfect load-balancing allows the analysis of specific algorithms to be divided into two parts: (i) the analysis of the cost of tolerating failures while performing work under free load-balancing, and (ii) the analysis of the cost of implementing load-balancing. We demonstrate the utility and generality of this approach by improving the analysis of two known efficient algorithms. We give an improved analysis of an efficient message-passing algorithm. We also derive a tight and complete analysis of the best known Do-All algorithm for the synchronous shared-memory model. Finally, we present a new upper bound on simulations of synchronous shared-memory algorithms on crash-prone processors. Received: 15 May 2002, Accepted: 15 June 2003, Published online: 6 February 2004. This work is supported in part by the NSF TOC Grants 9988304 and 0311368, and the NSF ITR Grant 0121277. The work of the second author is supported in part by the NSF CAREER Award 0093065. The work of the third author is supported in part by the NSF CAREER Award 9984778.

14.
The success of the Object Management Group's General Inter-ORB Protocol (GIOP) is leading to the desire to deploy GIOP in an ever-wider range of application areas, many of which are significantly more demanding than traditional areas in terms of performance. The well-known performance limitations of present-day GIOP-based object request brokers (ORBs) are therefore increasingly being seen as a problem. To help address this problem, this paper discusses a GIOP implementation which has high performance and quality-of-service support as explicit goals. The implementation, which is embedded in a research ORB called Gopi, is modular and extensible in nature and includes novel optimization techniques which should be separately portable to other ORB environments. This paper focuses on the message protocol aspects of Gopi's GIOP implementation; higher-layer issues such as marshalling and operation demultiplexing are not covered in detail. Figures are provided which position Gopi's GIOP performance against comparable ORBs. The paper also discusses some of the design decisions that have been made in the development of the GIOP protocol in the light of our implementation experience. Received: May 2000 / Accepted: December 2000

15.
We consider the gossip problem in a synchronous message-passing system. Participating processors are prone to omission failures, that is, a faulty processor may fail to send or receive a message. The gossip problem in the fault-tolerant setting is defined as follows: every correct processor must learn the initial value of any other processor, unless the other one is faulty; in the latter case either the initial value or the information about the fault must be learned. We develop two efficient algorithms that solve the gossip problem in time O(log n), where n is the number of processors in the system. The first one is an explicit algorithm (i.e., constructed in polynomial time) sending O(n log n + f²) messages, and the second one reduces the message complexity to O(n + f²), where f is the upper bound on the number of faulty processors.
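The problem statement can be made concrete with a toy round-by-round simulation in which every processor simply rebroadcasts everything it knows while an adversary drops some messages. This naive scheme uses far more time and messages than the O(log n) algorithms above, so it illustrates only the setting and the correctness condition; the drop probability, round count, and value encoding are assumptions for illustration.

```python
import random

def naive_gossip(n, faulty, rounds, drop_prob=0.5, seed=0):
    """Every processor starts with its own (id, value) pair.

    In each round every processor sends its current knowledge to everyone;
    messages to or from processors in `faulty` may be dropped (omission
    failures).  Returns what each correct processor knows at the end.
    """
    rng = random.Random(seed)
    knowledge = {p: {p: f"value-{p}"} for p in range(n)}
    for _ in range(rounds):
        inboxes = {p: [] for p in range(n)}
        for sender in range(n):
            for receiver in range(n):
                lossy = sender in faulty or receiver in faulty
                if lossy and rng.random() < drop_prob:
                    continue                      # omission failure
                inboxes[receiver].append(dict(knowledge[sender]))
        for p in range(n):
            for msg in inboxes[p]:
                knowledge[p].update(msg)
    return {p: knowledge[p] for p in range(n) if p not in faulty}

if __name__ == "__main__":
    result = naive_gossip(n=6, faulty={2}, rounds=3)
    # Every correct processor must have learned every other correct
    # processor's value; values of the faulty processor may or may not arrive.
    for p, known in result.items():
        assert all(q in known for q in range(6) if q not in {2})
```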

16.
Summary. Group mutual exclusion occurs naturally in situations where a resource can be shared by processes of the same group, but not by processes of different groups. For example, suppose data is stored in a CD-jukebox. Then, when a disc is loaded for access, users that need data on the disc can concurrently access the disc, while users that need data on a different disc have to wait until the current disc is unloaded. The design issues for group mutual exclusion have been modeled as the Congenial Talking Philosophers problem, and solutions for shared-memory models have been proposed [12,14]. As in ordinary mutual exclusion and many other problems in distributed systems, however, techniques developed for shared memory do not necessarily apply to message passing (and vice versa). So in this paper we investigate solutions for Congenial Talking Philosophers in computer networks where processes communicate by asynchronous message passing. We first present a solution that is a straightforward adaptation of Ricart and Agrawala's algorithm for ordinary mutual exclusion. Then we show that this simple adaptation suffers a severe performance degradation that could cause the system to behave as though only one process of a group can be in the critical section at a time. We then present a more efficient and highly concurrent distributed algorithm for the problem, the first such solution in computer networks. Received: August 2000 / Accepted: November 2001
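For reference, the classic Ricart-Agrawala rule that the first solution adapts can be sketched as a pair of message handlers: a process timestamps its request, waits for a reply from every other process, and defers its own replies to requests with lower priority. The transport layer is abstracted away (the `send` callback is an assumption), and this sketch shows ordinary mutual exclusion only, not the group extension developed in the paper.

```python
class RicartAgrawala:
    """One process's state in Ricart-Agrawala mutual exclusion (sketch)."""

    def __init__(self, my_id, peers, send):
        self.my_id = my_id
        self.peers = set(peers)          # ids of all other processes
        self.send = send                 # send(dest, message) callback
        self.clock = 0                   # Lamport clock
        self.requesting = False
        self.my_request = None           # (timestamp, id) of my pending request
        self.replies = set()
        self.deferred = []               # peers to answer after leaving the CS

    def request_cs(self):
        self.clock += 1
        self.requesting = True
        self.my_request = (self.clock, self.my_id)
        self.replies = set()
        for p in self.peers:
            self.send(p, ("REQUEST", self.my_request))

    def on_request(self, sender, their_request):
        self.clock = max(self.clock, their_request[0]) + 1
        # Reply immediately unless we are requesting with higher priority;
        # the lower (timestamp, id) pair wins, breaking ties by id.
        if self.requesting and self.my_request < their_request:
            self.deferred.append(sender)
        else:
            self.send(sender, ("REPLY",))

    def on_reply(self, sender):
        self.replies.add(sender)

    def can_enter_cs(self):
        return self.requesting and self.replies == self.peers

    def release_cs(self):
        self.requesting = False
        for p in self.deferred:
            self.send(p, ("REPLY",))
        self.deferred = []
```

A group-aware variant must additionally let same-group requests proceed together; the abstract's point is that doing this naively can collapse back to one process at a time, which motivates the more concurrent algorithm the paper develops.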

17.
We consider the problem of message (and bit) efficient broadcasting in complete networks with dynamic faults. Despite the simplicity of the setting, the problem turned out to be surprisingly interesting from the algorithmic point of view. In this paper we show an Ω(n + t·f^(t/(t−1))) lower bound on the number of messages sent by any t-step broadcasting algorithm, where f is the number of faults per step. The core of the paper contains a constructive O(n + t·f^((t+1)/t)) upper bound. The algorithms involved are of time complexity O(t), not strictly t. In addition, we present a bit-efficient algorithm of O(n log² n) bit and O(log n) time complexities. We also show that it is possible to achieve the same message complexity even if the nodes do not know the ids of their neighbours, but instead have only a Weak Sense of Direction.

18.
Summary. In a distributed system, high-level actions can be modeled by nonatomic events. This paper proposes causality relations between distributed nonatomic events and provides efficient testing conditions for the relations. The relations provide a fine level of granularity for specifying causality between distributed nonatomic events. The set of relations between nonatomic events is complete in first-order predicate logic, using only the causality relation between atomic events. For a pair of distributed nonatomic events X and Y, evaluating any of the causality relations requires a number of integer comparisons that depends on the numbers of nodes on which the two nonatomic events X and Y occur. In this paper, we show that this polynomial complexity of evaluation can be simplified to a linear complexity using properties of partial orders. Specifically, we give per-relation bounds showing that the relations can be evaluated with a linear number of integer comparisons in these node counts. During the derivation of the efficient testing conditions, we also define special system execution prefixes associated with distributed nonatomic events and examine their knowledge-theoretic significance. Received: July 1997 / Accepted: May 1998
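The atomic-event causality relation from which the nonatomic relations are built is the standard happened-before order, which can be tested with vector clocks; the small sketch below shows that base test. The vector-clock encoding is the usual textbook mechanism and an assumption here, since the paper works directly with the partial order on events.

```python
def happened_before(vc_e, vc_f):
    """True iff atomic event e causally precedes atomic event f.

    vc_e and vc_f are the events' vector timestamps: e -> f exactly when
    vc_e <= vc_f componentwise and the two vectors differ.
    """
    return all(a <= b for a, b in zip(vc_e, vc_f)) and vc_e != vc_f

def concurrent(vc_e, vc_f):
    """Atomic events with incomparable timestamps are causally concurrent."""
    return not happened_before(vc_e, vc_f) and not happened_before(vc_f, vc_e)

if __name__ == "__main__":
    e = [2, 1, 0]      # events in a three-process system
    f = [3, 1, 2]
    g = [0, 2, 0]
    assert happened_before(e, f)
    assert concurrent(e, g)
```

Roughly speaking, a causality relation between two nonatomic events X and Y is a quantified combination of such pairwise tests over the atomic events making up X and Y, which is how the counts of nodes hosting X and Y enter the comparison complexity.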

19.
Asynchronous group mutual exclusion
Abstract. Mutual exclusion and concurrency are two fundamental and essentially opposite features in distributed systems. However, in some applications such as Computer Supported Cooperative Work (CSCW) we have found it necessary to impose mutual exclusion on different groups of processes in accessing a resource, while allowing processes of the same group to share the resource. To our knowledge, no such design issue has been previously raised in the literature. In this paper we address this issue by presenting a new problem, called Congenial Talking Philosophers, to model group mutual exclusion. We also propose several criteria to evaluate solutions of the problem and to measure their performance. Finally, we provide an efficient and highly concurrent distributed algorithm for the problem in a shared-memory model where processes communicate by reading from and writing to shared variables. The distributed algorithm meets the proposed criteria, and has performance similar to some naive but centralized solutions to the problem. Received: November 1998 / Accepted: April 2000

20.
In the classical art gallery problem, guards must be placed in a polygon P so that each point of P is seen by at least one guard. We introduce and explore the edge-covering problem: the guards are required to observe the edges of P (metaphorically, the paintings on the walls of the art gallery), and not necessarily every interior point. We compare minimum edge and interior covers for a given polygon and analyze the bounds and complexity for the edge-covering problem. We also introduce and analyze a restricted edge-covering problem, where full visibility of each edge from at least one guard is required. For this problem we present an algorithm that computes a set of regions where a minimum set of guards must be located. The algorithm can also deal with the external visibility of a set of polygons.

