1.

Possible and Impossible Self-Stabilizing Digital Clock Synchronization in General Graphs

Shlomi Dolev 《Real-Time Systems》1997,12(1):95-107

We study digital clock synchronization for multiprocessor systems, where processors are triggered by a common clock pulse and communicate with others via shared memory. A self-stabilizing digital clock synchronization protocol for systems with a general communication graph is presented. The protocol can commence in an arbitrary non-consistent system state and converges to a legitimate state in which the clocks are synchronized and incremented by one in every subsequent pulse. To enhance the fault-tolerance of our protocol, we allow processors to stop operating during and following convergence. Crash failures may partition the communication graph into several connected components. Our protocol synchronizes the clocks of the processors in every such connected component. For the case in which faulty processors can exhibit Byzantine behavior, we prove that there is no digital clock synchronization protocol that tolerates even a single faulty processor.
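The convergence behaviour described in this abstract can be illustrated with a minimal sketch. It assumes the textbook unbounded "max plus one" rule (on every pulse, each processor adopts the largest clock value among itself and its neighbours, plus one), which is a simplification and not necessarily the protocol of the paper. The names `pulse`, `neighbors`, and `history` are hypothetical.

```python
def pulse(clocks, neighbors):
    """One common pulse: every processor reads its neighbours' clocks,
    then all processors update simultaneously (assumed max-plus-one rule)."""
    return [
        max([clocks[i]] + [clocks[j] for j in neighbors[i]]) + 1
        for i in range(len(clocks))
    ]

# Path graph 0 - 1 - 2 - 3 (diameter 3), started in an arbitrary,
# inconsistent state.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
clocks = [17, 3, 250, 42]

history = [clocks]
for _ in range(6):
    clocks = pulse(clocks, neighbors)
    history.append(clocks)

for t, state in enumerate(history):
    print(t, state)
```

Under this rule the largest initial value spreads one hop per pulse, so on a connected graph all clocks agree after at most as many pulses as the diameter and then advance by one per pulse. Unbounded integers are used here for simplicity.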

2.

An adaptive collect algorithm with applications

Hagit Attiya Arie Fouren Eli Gafni 《Distributed Computing》2002,15(2):87-96

Summary. In a shared-memory distributed system, n independent asynchronous processes communicate by reading and writing to shared variables. An algorithm is adaptive (to total contention) if its step complexity depends only on the actual number, k, of active processes in the execution; this number is unknown in advance and may change in different executions of the algorithm. Adaptive algorithms are inherently wait-free, providing fault-tolerance in the presence of an arbitrary number of crash failures and differing process speeds. A wait-free adaptive collect algorithm with O(k) step complexity is presented, together with its applications in wait-free adaptive algorithms for atomic snapshots, immediate snapshots and renaming. Received: August 1999 / Accepted: August 2001

3.

When consensus meets self-stabilization

Shlomi Dolev Ronen I. Kat Elad M. Schiller 《Journal of Computer and System Sciences》2010,76(8):884-900

This paper presents a shared-memory self-stabilizing failure detector, asynchronous consensus and replicated state-machine algorithm suite, the components of which can be started in an arbitrary state and converge to act as a virtual state-machine. Self-stabilizing algorithms can cope with transient faults. Transient faults can alter the system state to an arbitrary state and hence, cause a temporary violation of the safety property of the consensus. Started in an arbitrary state, the long lived, memory bounded and self-stabilizing failure detector, asynchronous consensus, and replicated state-machine suite, presented in the paper, recovers to satisfy eventual safety and eventual liveness requirements. Several new techniques and paradigms are introduced. The bounded memory failure detector abstracts away synchronization assumptions using bounded heartbeat counters combined with a balance–unbalance mechanism. The practically infinite paradigm is introduced in the scope of self-stabilization, where an execution of, say, 2⁶⁴ sequential steps is regarded as (practically) infinite. Finally, we present the first self-stabilizing wait-free reset mechanism that ensures eventual safety and can be used to implement efficient self-stabilizing timestamps that are of independent interest.

4.

Randomization adaptive self-stabilization

Shlomi Dolev Nir Tzachar 《Acta Informatica》2010,47(5-6):313-323

We present a scheme to convert self-stabilizing algorithms that use randomization during and following convergence to self-stabilizing algorithms that use randomization only during convergence. We thus reduce the number of random bits from an infinite number to an expected bounded number. The scheme is applicable to the cases in which there exists a local predicate for each node, such that global consistency is implied by the union of the local predicates. We demonstrate our scheme over the token circulation algorithm of Herman (Infor Process Lett 35:63–67, 1990) and the recent constant time Byzantine self-stabilizing clock synchronization algorithm by Ben-Or, Dolev and Hoch (Proceedings of the 27th Annual ACM SIGACT-SIGOPS symposium on principles of distributed computing, (PODC), 2008). The application of our scheme results in the first constant time Byzantine self-stabilizing clock synchronization algorithm that eventually stops using random bits.
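To make the starting point of this abstract concrete, the following is a minimal sketch of Herman's token circulation as it is commonly described: a synchronous ring with an odd number of processes, one bit each, where a process holds a token exactly when its bit equals its predecessor's. It does not implement the conversion scheme of the paper. The function names, the seed, and the round limit are assumptions chosen for illustration.

```python
import random

def tokens(bits):
    """Indices of token holders: process i holds a token iff its bit
    equals the bit of its ring predecessor i - 1."""
    return [i for i in range(len(bits)) if bits[i] == bits[i - 1]]

def step(bits, rng):
    """One synchronous round: token holders draw a fresh random bit,
    all other processes copy their predecessor's old bit."""
    return [
        rng.randint(0, 1) if bits[i] == bits[i - 1] else bits[i - 1]
        for i in range(len(bits))
    ]

def token_counts_until_stable(bits, rng, max_rounds=10_000):
    """Run until a single token remains (or the hypothetical round limit
    is hit) and return the token count observed after each round."""
    counts = [len(tokens(bits))]
    while counts[-1] > 1 and len(counts) <= max_rounds:
        bits = step(bits, rng)
        counts.append(len(tokens(bits)))
    return counts

rng = random.Random(0)   # fixed seed so the run is reproducible
start = [0] * 7          # odd ring size; initially every process holds a token
counts = token_counts_until_stable(start, rng)
print(counts)
```

Even after a single token remains, its holder keeps drawing a random bit in every round. That is the use of randomization following convergence that the abstract's scheme aims to eliminate.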

5.

A Wait-Free Sorting Algorithm

N. Shavit E. Upfal A. Zemach 《Theory of Computing Systems》2001,34(6):519-544

Sorting is one of a set of fundamental problems in computer science. In this paper we present the first wait-free algorithm for sorting an input array of size N using P ≤ N processors to achieve optimal running time. We show two variants of the algorithm, one deterministic and one randomized, and prove that, with high probability, the latter suffers no more than contention when run synchronously. Known sorting algorithms, when made wait-free through previously established transformation techniques, have complexity O(log³ N). The algorithm we present here, when run in the CRCW PRAM model, executes with high probability in O(log N) time when P = N, and O((N log N)/P) otherwise, which is optimal amongst comparison-based sorting algorithms. The wait-free property guarantees that the sort will complete despite any delays or failures incurred by the processors. This is a very desirable property from an operating systems point of view, since it allows oblivious thread scheduling as well as thread creation and deletion, without fear of losing the algorithm's correctness. Received May 15, 1998, and in revised form November 17, 1999. Online publication November 19, 2001.

6.

Optimal Adaptive Broadcasting with a Bounded Fraction of Faulty Nodes

K. Diks A. Pelc 《Algorithmica》2000,28(1):37-50

We consider broadcasting among n processors, f of which can be faulty. A fault-free processor, called the source, holds a piece of information which has to be transmitted to all other fault-free processors. We assume that the fraction f/n of faulty processors is bounded by a constant γ < 1. Transmissions are fault free. Faults are assumed to be of the crash type: faulty processors do not send or receive messages. We use the whispering model: pairs of processors communicating in one round must form a matching. A fault-free processor sending a message to another processor becomes aware of whether this processor is faulty or fault free and can adapt future transmissions accordingly. The main result of the paper is a broadcasting algorithm working in O(log n) rounds and using O(n) messages of logarithmic size, in the worst case. This is an improvement of the result from [17] where O((log n)²) rounds were used. Our method also gives the first algorithm for adaptive distributed fault diagnosis in O(log n) rounds. Received May 1997; revised May 1998.

7.

Using local-spin k-exclusion algorithms to improve wait-free object implementations

James H. Anderson Mark Moir 《Distributed Computing》1997,11(1):1-20

Summary. We present the first shared-memory algorithms for k-exclusion in which all process blocking is achieved through the use of “local-spin” busy waiting. Such algorithms are designed to reduce interconnect traffic, which is important for good performance. Our k-exclusion algorithms are starvation-free, and are designed to be fast in the absence of contention, and to exhibit scalable performance as contention rises. In contrast, all previous starvation-free k-exclusion algorithms require unrealistic operations or generate excessive interconnect traffic under contention. We also show that efficient, starvation-free k-exclusion algorithms can be used to reduce the time and space overhead associated with existing wait-free shared object implementations, while still providing some resilience to delays and failures. The resulting “hybrid” object implementations combine the advantages of local-spin spin locks, which perform well in the absence of process delays (caused, for example, by preemptions), and wait-free algorithms, which effectively tolerate such delays. We present performance results that confirm that this k-exclusion-based technique can improve the performance of existing wait-free shared object implementations. These results also show that lock-based implementations can be susceptible to severe performance degradation under multiprogramming, while our hybrid implementations are not. Received: December 1995 / Accepted: February 1997

8.

Dynamic processor allocation in scalable multiprocessors using boolean algebra *

《International Journal of Computer Mathematics》2012,89(3-4):333-358

In this paper, we study a new model for dynamic processor allocation in k-ary n-dimensional mesh or torus multiprocessors. The model uses Boolean functions to represent free processors and allocates processors by applying Boolean operations on the functions. The processor allocation algorithms based on the Boolean model can be implemented easily using binary decision diagrams (BDDs) and related software packages. To enhance the efficiency of the allocation algorithms, a reordering procedure is introduced to change the ordering of Boolean variables in the BDD representation and thereby change the free subcube composition. Such a change leads to an improved free processor recognition capability. The complexities of the proposed allocation algorithms are analyzed. The performance of the algorithms is evaluated using simulation and compared with other approaches.

9.

The BG distributed simulation algorithm

E. Borowsky E. Gafni N. Lynch S. Rajsbaum 《Distributed Computing》2001,14(3):127-146

We present a shared memory algorithm that allows a set of f+1 processes to wait-free “simulate” a larger system of n processes, that may also exhibit up to f stopping failures. Applying this simulation algorithm to the k-set-agreement problem enables conversion of an arbitrary k-fault-tolerant n-process solution for the k-set-agreement problem into a wait-free (k+1)-process solution for the same problem. Since the (k+1)-process k-set-agreement problem has been shown to have no wait-free solution [5,18,26], this transformation implies that there is no k-fault-tolerant solution to the n-process k-set-agreement problem, for any n. More generally, the algorithm satisfies the requirements of a fault-tolerant distributed simulation. The distributed simulation implements a notion of fault-tolerant reducibility between decision problems. This paper defines these notions and gives examples of their application to fundamental distributed computing problems. The algorithm is presented and verified in terms of I/O automata. The presentation has a great deal of interesting modularity, expressed by I/O automaton composition and both forward and backward simulation relations. Composition is used to include a safe agreement module as a subroutine. Forward and backward simulation relations are used to view the algorithm as implementing a multi-try snapshot strategy. The main algorithm works in snapshot shared memory systems; a simple modification of the algorithm that works in read/write shared memory systems is also presented. Received: February 2001 / Accepted: February 2001

10.

SELF-STABILIZING DISTRIBUTED SORTING IN TREE NETWORKS

《International Journal of Parallel, Emergent and Distributed Systems》2012,27(1):1-15

This paper presents a self-stabilizing distributed sorting algorithm for tree networks. The distributed sorting problem can be informally described as follows: Nodes cooperate to reach a global configuration where every node, depending on its identifier, is assigned a specific final value taken from a set of input values distributed across all nodes. The input values may change in time. In our solution, the system reaches its final configuration in a finite time after the input values are stable and the faults cease. The fault-tolerance and the adaptivity to changing input are achieved using Dijkstra's paradigm of self-stabilization. A self-stabilizing algorithm, regardless of the initial system state, will converge in finite time to a set of legitimate states without the need for explicit exception handlers or backward recovery. Our solution is based on a continuous broadcast with acknowledgment along the tree edges to achieve the synchronization among processes in the system. It has O(n × h) time complexity and only O(log(n) × δ) memory requirement, where δ is the degree of the tree and h is the height of the tree.

11.

A Randomized Algorithm for the Voronoi Diagram of Line Segments on Coarse-Grained Multiprocessors

Xiaotie Deng Binhai Zhu 《Algorithmica》1999,24(3-4):270-286

We present a randomized algorithm for computing the Voronoi diagram of line segments using coarse-grained parallel machines. Operating on P processors, for any input of n line segments, this algorithm performs O((n log n)/P) local operations per processor, O(n/P) messages per processor, and O(1) communication phases, with high probability for n = Ω(P^(3+ε)). Received June 1, 1997; revised March 10, 1998.

12.

Implementing the weakest failure detector for solving the consensus problem

《International Journal of Parallel, Emergent and Distributed Systems》2013,28(6):537-555

The concept of unreliable failure detector was introduced by Chandra and Toueg as a mechanism that provides information about process failures. This mechanism has been used to solve several agreement problems, such as the consensus problem. In this paper, algorithms that implement failure detectors in partially synchronous systems are presented. First, two simple algorithms of the weakest class to solve the consensus problem, namely the Eventually Strong class (◇S), are presented. While the first algorithm is wait-free, the second algorithm is f-resilient, where f is a known upper bound on the number of faulty processes. Both algorithms guarantee that, eventually, all the correct processes agree permanently on a common correct process, i.e. they also implement a failure detector of the class Omega (Ω). They are also shown to be optimal in terms of the number of communication links used forever. Additionally, a wait-free algorithm that implements a failure detector of the Eventually Perfect class (◇P) is presented. This algorithm is shown to be optimal in terms of the number of bidirectional links used forever.

13.

Time and Space Optimal Data Parallel Volume Rendering Using Permutation Warping

Craig M. Wittenbrink Arun K. Somani 《Journal of Parallel and Distributed Computing》1997,46(2):53

In this paper we present a data parallel volume rendering algorithm that possesses numerous advantages over prior published solutions. Volume rendering is a three-dimensional graphics rendering algorithm that computes views of sampled medical and simulation data, but has been much slower than other graphics algorithms because of the data set sizes and the computational complexity. Our algorithm uses permutation warping to achieve linear speedup (run time is O(S/P) for P processors when P = O(S/log S) for S = n³ samples), linear storage (O(S)) for large data sets, arbitrary view directions, and high-quality filters. We derived a new processor permutation assignment of five passes (our prior known solution was eight passes), and a new parallel compositing technique that is essential for scaling linearly on machines that have more processors than view rays to process (P > n²). We show a speedup of 15.7 for a 16k processor over a 1k processor MasPar MP-1 (16 is linear) and two frames/second with a 128³ volume and trilinear view reconstruction. In addition, we demonstrate volume sizes of 256³, constant run time over angles 5 to 75°, filter quality comparisons, and communication congestion of just 19 to 29%.

14.

Secure and self-stabilizing clock synchronization in sensor networks

Jaap-Henk Hoepman Andreas Larsson Philippas Tsigas 《Theoretical computer science》2011,412(40):5631-5647

In sensor networks, correct clocks have arbitrary starting offsets and nondeterministic fluctuating skews. We consider an adversary that aims at tampering with the clock synchronization by intercepting messages, replaying intercepted messages (after the adversary’s choice of delay), and capturing nodes (i.e., revealing their secret keys and impersonating them). We present an efficient clock sampling algorithm which tolerates attacks by this adversary, collisions, a bounded amount of losses due to ambient noise, and a bounded number of captured nodes that can jam, intercept, and send fake messages. The algorithm is self-stabilizing, so if these bounds are temporarily violated, the system can efficiently stabilize back to a correct state. Using this clock sampling algorithm, we construct the first self-stabilizing algorithm for secure clock synchronization in sensor networks that is resilient to the aforementioned adversarial attacks.

15.

A clock synchronization algorithm for the performance analysis of multicomputer systems

Giuseppe De Pietro Umberto Villano 《Concurrency and Computation》1994,6(8):653-671

The paper deals with the implementation of global time in multicomputer systems. After a formalization of the synchronization problem, techniques to estimate the synchronization delay and to compensate the drift error are proposed. Then SYNC_WAVE, a clock synchronization algorithm where the values of a reference clock are diffused in a wave-like manner, is described. SYNC_WAVE has no provision for fault-tolerance and is specially designed to introduce low CPU and communication overhead, in order to support performance analysis applications efficiently. An implementation of the devised algorithm in a transputer-based system is presented, showing the accuracy results obtained. Finally SYNC_WAVE is compared to other synchronization algorithms and several of its possible applications are suggested.

16.

Fault-Tolerant Matrix Operations for Networks of Workstations Using Diskless Checkpointing

James S. Plank Youngbae Kim Jack J. Dongarra 《Journal of Parallel and Distributed Computing》1997,43(2):427

Networks of workstations (NOWs) offer a cost-effective platform for high-performance, long-running parallel computations. However, these computations must be able to tolerate the changing and often faulty nature of NOW environments. We present high-performance implementations of several fault-tolerant algorithms for distributed scientific computing. The fault-tolerance is based on diskless checkpointing, a paradigm that uses processor redundancy rather than stable storage as the fault-tolerant medium. These algorithms are able to run on clusters of workstations that change over time due to failure, load, or availability. As long as there are at least n processors in the cluster, and failures occur singly, the computation will complete in an efficient manner. We discuss the details of how the algorithms are tuned for fault-tolerance and present the performance results on a PVM network of Sun workstations connected by a fast, switched ethernet.
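The idea of using processor redundancy instead of stable storage can be sketched as follows. The sketch assumes a single extra processor that holds the bytewise XOR parity of all in-memory checkpoints, so that one lost checkpoint can be rebuilt from the survivors. The XOR encoding and the names `encode_parity` and `recover` are illustrative assumptions; the abstract does not state which encoding the paper's algorithms use.

```python
from functools import reduce

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode_parity(checkpoints):
    """Checkpoint held in memory by the (assumed) extra parity processor."""
    return reduce(xor_bytes, checkpoints)

def recover(checkpoints, parity, failed):
    """Rebuild the checkpoint of the single failed processor from the
    parity block and the surviving processors' checkpoints."""
    survivors = [c for i, c in enumerate(checkpoints) if i != failed]
    return reduce(xor_bytes, survivors, parity)

# Three application processors with equal-sized in-memory checkpoints.
checkpoints = [b"alpha___", b"bravo___", b"charlie_"]
parity = encode_parity(checkpoints)

# Processor 1 fails; its state is reconstructed without touching a disk.
restored = recover(checkpoints, parity, failed=1)
print(restored)
```

A single parity block tolerates only one failure between checkpoints, which is consistent with the abstract's condition that failures occur singly.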

17.

Efficient algorithms for system diagnosis with both processor and comparator faults

Chen Y. Bucken W. Echtle K. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(4):371-381

For the comparison-based self-diagnosis of multiprocessor systems, an extended model that considers both processor and comparator faults is presented. It is shown that in this model the system diagnosability is t ⩽ ⌊δ/2⌋, where δ is the minimum vertex degree of the system graph. However, if the number of faulty comparators is assumed not to exceed the number of faulty processors, the diagnosability of the model reaches t ⩽ δ. An optimal O(|E|) algorithm, where E is the set of comparators, is given for identifying all faulty processors and comparators, provided that the total number of faulty components does not exceed the system diagnosability, and an O(|E|²) algorithm for the case t ⩽ δ is also presented. These efficient algorithms determine the faulty processors by calculating each processor's weight, which is mainly defined by the number of adjacent relative tests stating `agreement'. After sorting the processors according to their weights, the algorithms determine all faulty components by separating the sorted processor list

18.

Approximate algorithms for the knapsack problem on parallel computers

P. S. Gopalakrishnan I. V. Ramakrishnan L. N. Kanal 《Information and Computation》1991,91(2)

Computing an optimal solution to the knapsack problem is known to be NP-hard. Consequently, fast parallel algorithms for finding such a solution without using an exponential number of processors appear unlikely. An attractive alternative is to compute an approximate solution to this problem rapidly using a polynomial number of processors. In this paper, we present an efficient parallel algorithm for finding approximate solutions to the 0–1 knapsack problem. Our algorithm takes an ε, 0 < ε < 1, as a parameter and computes a solution whose deviation from the optimal solution is at most a fraction ε of the optimal solution. For a problem instance having n items, this computation uses O(n^(5/2)/ε^(3/2)) processors and requires O(log³ n + log² n log(1/ε)) time. The upper bound on the processor requirement of our algorithm is established by reducing it to a problem on weighted bipartite graphs. This processor complexity is a significant improvement over that of other known parallel algorithms for this problem.
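The approximation guarantee in this abstract (a solution within a fraction ε of the optimum) can be illustrated with the classic sequential value-scaling scheme. This is a swapped-in textbook technique, not the parallel algorithm of the paper. The function name `knapsack_approx` and the use of `Fraction` for exact scaling are assumptions made for this sketch.

```python
from fractions import Fraction

def knapsack_approx(values, weights, capacity, eps):
    """Return the total value of a feasible 0-1 knapsack solution that is
    at least (1 - eps) times the optimum, using value scaling plus dynamic
    programming (sequential sketch, not the paper's parallel method)."""
    # Items that cannot fit alone can never be part of any solution.
    items = [(v, w) for v, w in zip(values, weights) if w <= capacity and v > 0]
    if not items:
        return 0
    n = len(items)
    vmax = max(v for v, _ in items)
    scale = Fraction(eps) * vmax / n          # rounding granularity
    scaled = [int(v // scale) for v, _ in items]
    total = sum(scaled)

    inf = float("inf")
    min_weight = [inf] * (total + 1)  # lightest subset reaching scaled value s
    true_value = [0] * (total + 1)    # unscaled value of that same subset
    min_weight[0] = 0
    for (v, w), sv in zip(items, scaled):
        # Iterate downwards so each item is used at most once.
        for s in range(total, sv - 1, -1):
            if min_weight[s - sv] + w < min_weight[s]:
                min_weight[s] = min_weight[s - sv] + w
                true_value[s] = true_value[s - sv] + v
    return max(true_value[s] for s in range(total + 1)
               if min_weight[s] <= capacity)

values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
result = knapsack_approx(values, weights, capacity, Fraction(1, 10))
print(result)
```

Rounding each value down to a multiple of ε·vmax/n loses at most ε·vmax in total, and vmax is at most the optimum once oversized items are discarded, which gives the (1 − ε) bound. The table size grows as O(n²/ε), so smaller ε costs more time and memory.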

19.

Efficient parallel solutions to some geometric problems

Mikhail J. Atallah Michael T. Goodrich 《Journal of Parallel and Distributed Computing》1986,3(4)

This paper presents new algorithms for solving some geometric problems on a shared memory parallel computer, where concurrent reads are allowed but no two processors can simultaneously attempt to write in the same memory location. The algorithms are quite different from known sequential algorithms, and are based on the use of a new parallel divide-and-conquer technique. One of our results is an O(log n) time, O(n) processor algorithm for the convex hull problem. Another result is an O(log n log log n) time, O(n) processor algorithm for the problem of selecting a closest pair of points among n input points.

20.

Parallel Dictionaries Using AVL Trees

《Journal of Parallel and Distributed Computing》1998,49(1):146-155

AVL (Adel'son-Vel'skii and Landis) trees are efficient data structures for implementing dictionaries. We present a parallel dictionary, using AVL trees, on the EREW PRAM by proposing optimal algorithms to perform k operations with p (1 ≤ p ≤ k) processors. An explicit processor scheduling is devised to avoid simultaneous reads in our parallel algorithm to perform k searches, which avoids the need for any additional memory in the parallelization. To perform multiple insertions and deletions, we identify rotations (in addition to AVL tree rotations) required to restore balance and present parallel algorithms to perform p insertions/deletions in O(log n + log p) time with p processors.