首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 328 毫秒
1.
This paper proves Ω(m) lower bounds on the step complexity of UPDATE operations for space-optimal wait-free implementations of an m-component snapshot object from historyless objects. These lower bounds follow from lower bounds for a new, more general class of implementations from base objects of any type. This work extends a similar lower bound by Israeli and Shirazi for implementations of m-component single-writer snapshot objects from single-writer registers.  相似文献   

2.
We give matching upper and lower bounds of \(\varTheta(\min(\frac{\log m}{\log \log m},\, n))\) for the individual step complexity of a wait-free m-valued adopt-commit object implemented using multi-writer registers for n anonymous processes. While the upper bound is deterministic, the lower bound holds for randomized adopt-commit objects as well. Our results are based on showing that adopt-commit objects are equivalent, up to small additive constants, to a simpler class of objects that we call conflict detectors. Our anonymous lower bound also applies to the individual step complexity of m-valued wait-free anonymous consensus, even for randomized algorithms with global coins against an oblivious adversary. The upper bound can be used to slightly improve the cost of randomized consensus against an oblivious adversary. For deterministic non-anonymous implementations of adopt-commit objects, we show a lower bound of \(\varOmega(\min(\frac{\log m}{\log \log m}, \frac{\sqrt{\log n}}{\log \log n}))\) and an upper bound of \(O(\min(\frac{\log m}{\log \log m},\, \log n))\) on the worst-case individual step complexity. For randomized non-anonymous implementations, we show that any execution contains at least one process whose steps exceed the deterministic lower bound.  相似文献   

3.
Summary.  We present the first shared-memory algorithms for k-exclusion in which all process blocking is achieved through the use of “local-spin” busy waiting. Such algorithms are designed to reduce interconnect traffic, which is important for good performance. Our k-exclusion algorithms are starvation-free, and are designed to be fast in the absence of contention, and to exhibit scalable performance as contention rises. In contrast, all previous starvation-free k-exclusion algorithms require unrealistic operations or generate excessive interconnect traffic under contention. We also show that efficient, starvation-free k-exclusion algorithms can be used to reduce the time and space overhead associated with existing wait-free shared object implementations, while still providing some resilience to delays and failures. The resulting “hybrid” object implementations combine the advantages of local-spin spin locks, which perform well in the absence of process delays (caused, for example, by preemptions), and wait-free algorithms, which effectively tolerate such delays. We present performance results that confirm that this k-exclusion-based technique can improve the performance of existing wait-free shared object implementations. These results also show that lock-based implementations can be susceptible to severe performance degradation under multiprogramming, while our hybrid implementations are not. Received: December 1995 / Accepted: February 1997  相似文献   

4.
We consider asynchronous multiprocessors where processes communicate only by reading or writing shared memory. We show how to implement consensus, compare-and-swap and other comparison primitives, as well as load-linked/store-conditional (LL/SC) using only a constant number of remote memory references (RMRs), in both the cache-coherent and the distributed-shared-memory models of such multiprocessors. Our implementations are blocking, rather than wait-free: they ensure progress provided all processes that invoke the implemented primitive are live. Our results imply that any algorithm using read and write operations, and either comparison primitives or LL/SC, can be simulated by an algorithm that uses read and write operations only, with at most a constant-factor increase in RMR complexity.  相似文献   

5.
This paper extends Common2, the family of objects that implement and are wait-free implementable from 2 consensus objects, in two ways: First, the stack object is shown to be in the family, refuting a conjecture to the contrary [6]. Second, Common2 is investigated in the unbounded concurrency model, whereas until now it was considered only in an n-process model. We show that the fetch-and-add, test-and-set , and stack objects are in Common2 even with respect to this stronger notion of wait-free implementation. Our constructions rely on a wait-free implementation of immediate snapshots in the unbounded concurrency model, which was previously not known to be possible. The introduction of unbounded concurrency to the study of Common2 opens several directions of research: are there objects that have n-process implementations but are not unbounded concurrency implementable? We conjecture that swap is such an object. Additionally, the hope is that a queue impossibility proof, which eludes us in the n-process model, will be easier to establish in the unbounded concurrency model.  相似文献   

6.
This paper introduces solo-valency, a variation on the valency proof technique originated by Fischer, Lynch, and Paterson. The new technique focuses on critical events that influence the responses of solo runs by individual operations, rather than on critical events that influence a protocol’s single decision value. It allows us to derive lower bounds on the time to perform an operation for lock-free implementations of concurrent objects such as linearizable queues, stacks, sets, hash tables, counters, approximate agreement, and more. Time is measured as the number of distinct base objects accessed and the number of stalls caused by contention in accessing memory, incurred by a process as it performs a single operation. We introduce the influence level metric that quantifies the extent to which the response of a solo execution of one process can be changed by other processes. We then prove the existence of a relationship between the space complexity, latency, contention and influence level of all lock-free object implementations. Our results are broad in that they hold for implementations that may use any collection of read-modify-write operations in addition to read and write, and in that they apply even if base objects have unbounded size. Part of this work was done while the first author was a graduate student in the Department of Computer Science, Tel-Aviv University. This work was supported in part by a grant from Sun Microsystems.  相似文献   

7.
We present Byzantine Disk Paxos, an asynchronous shared-memory consensus algorithm that uses a collection of n < 3t disks, t of which may fail by becoming non-responsive or arbitrarily corrupted. We give two constructions of this algorithm; that is, we construct two different t-tolerant (i.e., tolerating up to t disk failures) building blocks, each of which can be used, along with a leader oracle, to solve consensus. One building block is a t-tolerant wait-free shared safe register. The second building block is a t-tolerant regular register that satisfies a weaker termination (liveness) condition than wait freedom: its write operations are wait-free, whereas its read operations are guaranteed to return only in executions with a finite number of writes. We call this termination condition finite writes (FW), and show that wait-free consensus is solvable with FW-terminating registers and a leader oracle. We construct each of these t-tolerant registers from n < 3t base registers, t of which can be non-responsive or Byzantine. All the previous t-tolerant wait-free constructions in this model used at least 4t + 1 fault-prone registers, and we are not familiar with any prior FW-terminating constructions in this model. We further show tight lower bounds on the number of invocation rounds required for optimal resilience reliable register constructions, or more generally, constructions that use less than 4t + 1 fault-prone registers. Our lower bounds show that such constructions are inherently more costly than constructions that use 4t + 1 registers, and that our constructions have optimal round complexity. Furthermore, our wait-free construction is early-stopping, and it achieves the optimal round complexity with any number of actual failures. A preliminary version of this paper, by the same authors and with the same title, appears in Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing (PODC ’04), July 2004, pages 226–235.  相似文献   

8.
This paper presents lower bounds on the time and space complexity of implementations that use k-compare&swap (k-CAS) synchronization primitives. We prove that using k-CAS primitives can improve neither the time nor the space complexity of implementations of widely used concurrent objects, such as counter, stack, queue, and collect. Surprisingly, overly restrictive use of k-CAS may even increase the space complexity required by such implementations. We prove a lower bound of Omega (log_2 n) on the round complexity of implementations of a collect object using read, write, and k-CAS, for any k, where n is the number of processes in the system. There is an implementation of collect with O(log_2 n) round complexity that uses only reads and writes. Thus, our lower bound establishes that k-CAS is no stronger than read and write for collect implementation round complexity. For k-CAS operations that return the values of all the objects they access, we prove that the total step complexity of implementing key objects such as counters, stacks, and queues is Omega (n log_k n). We also prove that k-CAS cannot improve the space complexity of implementing many objects (including counter, stack, queue, and single-writer snapshot). An implementation has to use at least n base objects even if k-CAS is allowed, and if all operations (other than read) swap exactly k base objects, then it must use at least k cdot n base objects.  相似文献   

9.
This paper presents an algorithm for implementing a k-valued regular register (the logical register) using binary regular registers (the physical registers) that requires only one physical write per logical write. The same algorithm using binary atomic registers implements a k-valued atomic register. The algorithm is simple to describe and depends on properties of paths in a related graph. Two lower bounds are given on the number of registers required by one-write implementations in the regular case. The first lower bound, , holds for a fairly general class of algorithms. The second lower bound holds for a restricted class of implementations and implies that our algorithm is optimal for this class. Both lower bounds improve on the best previously known lower bound, which was k. The two lower bounds also hold for the atomic case under further restrictions. Received: 9 June 2000  相似文献   

10.
It is considered good distributed computing practice to devise object implementations that tolerate contention, periods of asynchrony and a large number of failures, but perform fast if few failures occur, the system is synchronous and there is no contention. This paper initiates the first study of quorum systems that help design such implementations by encompassing, at the same time, optimal resilience, as well as optimal best-case complexity. We introduce the notion of a refined quorum system (RQS) of some set S as a set of three classes of subsets (quorums) of S: first class quorums are also second class quorums, themselves being also third class quorums. First class quorums have large intersections with all other quorums, second class quorums typically have smaller intersections with those of the third class, the latter simply correspond to traditional quorums. Intuitively, under uncontended and synchronous conditions, a distributed object implementation would expedite an operation if a quorum of the first class is accessed, then degrade gracefully depending on whether a quorum of the second or the third class is accessed. Our notion of refined quorum system is devised assuming a general adversary structure, and this basically allows algorithms relying on refined quorum systems to relax the assumption of independent process failures, often questioned in practice. We illustrate the power of refined quorums by introducing two new optimal Byzantine-resilient distributed object implementations: an atomic storage and a consensus algorithm. Both match previously established resilience and best-case complexity lower bounds, closing open gaps, as well as new complexity bounds we establish here. Each of our algorithms is representative of a different class of architectures, highlighting the generality of the refined quorum abstraction.  相似文献   

11.
We present lock-free and wait-free universal constructions for implementing large shared objects. Most previous universal constructions require processes to copy the entire object state, which is impractical for large objects. Previous attempts to address this problem require programmers to explicitly fragment large objects into smaller, more manageable pieces, paying particular attention to how such pieces are copied. In contrast, our constructions are designed to largely shield programmers from this fragmentation. Furthermore, for many objects, our constructions result in lower copying overhead than previous ones. Fragmentation is achieved in our constructions through the use of load-linked, store-conditional, and validate operations on a “large” multiword shared variable. Before presenting our constructions, we show how these operations can be efficiently implemented from similar one-word primitives  相似文献   

12.
This work addresses the possibility or impossibility, and the corresponding costs, of devising concurrent, low-contention implementations of atomic Read&Modify&Write (or RMW) operations in a distributed system. A natural class of monotone RMW operations associated with monotone groups, a certain class of algebraic groups introduced here, is considered. The popular Fetch&Add and Fetch&Multiply operations are examples from the class.A Monotone Linearizability Lemma is proved and employed as a chief combinatorial instrument in this work; it establishes inherent ordering constraints of linearizability for a certain class of executions of any distributed system implementing a monotone RMW operation.The end results of this work specifically apply to implementations of (monotone) RMW operations that are based on switching networks, a recent class of concurrent, low-contention data structures that generalize counting networks (J. ACM 41(5) (1994) 1020–1048) (which implemented the traditional Fetch&Increment operation). These results are negative; they are shown through the Monotone Linearizability Lemma. In particular, the first lower bounds on size (the number of switches in the network) for any (non-trivial) switching network implementing a monotone RMW operation are derived. It is proven that if the network incurs low contention, then its size must be infinite, no matter whether the number of states of each switch is finite or infinite. Since Fetch&Increment is implementable with counting networks of finite-size (J. ACM 41(5) (1994) 1020–1048), these lower bounds imply a space complexity separation between Fetch&Increment and any monotone RMW operation in the model of switching networks.The presented lower bounds provide a mathematical explanation for the observed inability of researchers over the last thirteen years to extend counting networks, while keeping their finite-size, high-concurrency and low-contention, in order to perform tasks more complex than Fetch&Increment but yet as simple as Fetch&Add.  相似文献   

13.
Object Inlining (OI) is a known optimization in object oriented programming in which referenced objects of class B are inlined into their referencing objects of class A by making all fields and methods of class B part of class A. The optimization saves all the new operations of B type objects from class A and at the same time replaces all indirect accesses, from A to fields of B, by direct accesses. To the best of our knowledge, in-spite of the significant performance potential of the OI optimization, reported performance measurements were relatively moderate. This is because an aggressive OI optimization requires complex analysis and code transformations to overcome problems like multiple references to the inlinable object, object references that escape their object scope, etc.  相似文献   

14.
The timestamp problem captures a fundamental aspect of asynchronous distributed computing. It allows processes to label events throughout the system with timestamps that provide information about the real-time ordering of those events. We consider the space complexity of wait-free implementations of timestamps from shared read-write registers in a system of n processes. We prove an lower bound on the number of registers required. If the timestamps are elements of a nowhere dense set, for example the integers, we prove a stronger, and tight, lower bound of n. However, if timestamps are not from a nowhere dense set, this bound can be beaten: we give an implementation that uses n − 1 (single-writer) registers. We also consider the special case of anonymous implementations, where processes are programmed identically and do not have unique identifiers. In contrast to the general case, we prove anonymous timestamp implementations require n registers. We also give an implementation to prove that this lower bound is tight. This is the first anonymous timestamp implementation that uses a finite number of registers.  相似文献   

15.
We study a simple technique, originally presented by Herlihy (ACM Trans. Program. Lang. Syst. 15(5):745–770, 1993), for executing concurrently, in a wait-free manner, blocks of code that have been programmed for sequential execution and require significant synchronization in order to be performed in parallel. We first present an implementation of this technique, called Sim, which employs a collect object. We describe a simple implementation of a collect object from a single shared object that supports atomic Add (or XOR) in addition to read; this implementation has step complexity O(1). By plugging in to Sim this implementation, Sim exhibits constant step complexity as well. This allows us to derive lower bounds on the step complexity of implementations of several shared objects, like Add, XOR, collect, and snapshot objects, from LL/SC objects. We then present a practical version of Sim, called PSim, which is implemented in a real shared-memory machine. From a theoretical perspective, PSim has worse step complexity than Sim, its theoretical analog; in practice though, we experimentally show that PSim is highly-efficient: it outperforms several state-of-the-art lock-based and lock-free synchronization techniques, and this given that it is wait-free, i.e. that it satisfies a stronger progress condition than all the algorithms that it outperforms. We have used PSim to get highly-efficient wait-free implementations of stacks and queues.  相似文献   

16.
Despite the large amount of Byzantine fault-tolerant algorithms for message-passing systems designed through the years, only recently algorithms for the coordination of processes subject to Byzantine failures using shared memory have appeared. This paper presents a new computing model in which shared memory objects are protected by fine-grained access policies, and a new shared memory object, the Policy-Enforced Augmented Tuple Space (PEATS). We show the benefits of this model by providing simple and efficient consensus algorithms. These algorithms are much simpler and requires less shared memory operations, using also less memory bits than previous algorithms based on ACLs and sticky bits. We also prove that PEATS objects are universal, i.e., that they can be used to implement any other shared memory object, and present lock-free and wait-free universal constructions.  相似文献   

17.
Anatomic snapshot memory object in shared memory systems enables a set of processes, calledscanners, to obtain a consistent picture of the shared memory while other processes, calledupdaters, keep updating memory locations concurrently. In this paper we present two conversion methods of snapshot implementations. Using the first conversion method we obtain a new snapshot implementation in which the scan operation has linear time complexity and the time complexity of the update operation becomes the sum of the time complexities of the original implementation. Applying the second conversion method yields similar results, where in this case the time complexity of the update protocol becomes linear. Although our conversion methods use unbounded space, their space complexity can be bounded using known techniques. One of the most intriguing open problems in distributed wait-free computing is the existence of a linear-time implementation of this object. Using our conversion methods and known constructions we obtain the following results:
  • ?Consider a system ofn processes, each an updater and a scanner. We present an implementation in which the time complexity of either the update or the scan operation is linear, while the time complexity of the second operation isO(n logn).
  • ?We present an implementation with linear time complexity when the number of either updaters or scanners isO(n/logn), wheren is the total number of processes.
  • ?We present an implementation with amortized linear time complexity when one of the protocols (either upate or scan) is executed significantly more often than the other protocol.
  •   相似文献   

    18.
    A naming protocol assigns unique names (keys) to every process out of a set of communicating processes. We construct a randomized wait-free naming protocol using wait-free atomic read/write registers (shared variables) as process intercommunication primitives. Each process has its own private register and can read all others. The addresses/names each one uses for the others are possibly different: Processes p and q address the register of process r in a way not known to each other. For processes and , the protocol uses a name space of size and running time (read/writes to shared bits) with probability at least , and overall expected running time. The protocol is based on the wait-free implementation of a novel -Test&SetOnce object that randomly and fast selects a winner from a set of q contenders with probability at least in the face of the strongest possible adaptive adversary. Received: September 1994 / Accepted: January 1998  相似文献   

    19.
    20.
    Hazard pointers: safe memory reclamation for lock-free objects   总被引:2,自引:0,他引:2  
    Lock-free objects offer significant performance and reliability advantages over conventional lock-based objects. However, the lack of an efficient portable lock-free method for the reclamation of the memory occupied by dynamic nodes removed from such objects is a major obstacle to their wide use in practice. We present hazard pointers, a memory management methodology that allows memory reclamation for arbitrary reuse. It is very efficient, as demonstrated by our experimental results. It is suitable for user-level applications - as well as system programs - without dependence on special kernel or scheduler support. It is wait-free. It requires only single-word reads and writes for memory access in its core operations. It allows reclaimed memory to be returned to the operating system. In addition, it offers a lock-free solution for the ABA problem using only practical single-word instructions. Our experimental results on a multiprocessor system show that the new methodology offers equal and, more often, significantly better performance than other memory management methods, in addition to its qualitative advantages regarding memory reclamation and independence of special hardware support. We also show that lock-free implementations of important object types, using hazard pointers, offer comparable performance to that of efficient lock-based implementations under no contention and no multiprogramming, and outperform them by significant margins under moderate multiprogramming and/or contention, in addition to guaranteeing continuous progress and availability, even in the presence of thread failures and arbitrary delays.  相似文献   

    设为首页 | 免责声明 | 关于勤云 | 加入收藏

    Copyright©北京勤云科技发展有限公司  京ICP备09084417号