
1.

A Randomized Algorithm for the Voronoi Diagram of Line Segments on Coarse-Grained Multiprocessors

Xiaotie Deng Binhai Zhu 《Algorithmica》1999,24(3-4):270-286

We present a randomized algorithm for computing the Voronoi diagram of line segments using coarse-grained parallel machines. Operating on P processors, for any input of n line segments, this algorithm performs O((n log n)/P) local operations per processor, O(n/P) messages per processor, and O(1) communication phases, with high probability for n = Ω(P^(3+ε)). Received June 1, 1997; revised March 10, 1998.

2.

A new solution for the Byzantine agreement problem

Hui-Ching Hsieh 《Journal of Parallel and Distributed Computing》2011,71(10):1261-1277

Reliability is an important research topic in distributed computing systems consisting of a large number of processors. To achieve reliability, the fault-tolerance scheme of the distributed computing system must be revised. This kind of problem is known as a Byzantine agreement (BA) problem. It requires all fault-free processors to agree on a common value, even if some components are corrupt. Consequently, there have been significant studies of this agreement problem in distributed systems. However, the traditional BA protocols focus on running ⌊(n−1)/3⌋+1 rounds of message exchange continuously to make each fault-free processor reach an agreement. Because a large number of messages results in a large protocol overhead, those protocols are inefficient and unreasonable, especially for network environments that have a large number of processors. In this study, we propose a novel and efficient protocol to reduce the number of messages. Our protocol can collect, compare and replace the received values to find the reliable processors and replace the values sent by the unreliable processors. Subsequently, each processor can agree on a common value through three rounds of message exchange. Furthermore, the proposed protocol can use the minimum number of messages to tolerate the maximum number of faulty components in a distributed system.
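The round counts contrasted in this abstract can be made concrete with a small calculation. The sketch below only evaluates the formula ⌊(n−1)/3⌋+1 quoted for traditional BA protocols and compares it with the three rounds claimed for the proposed protocol; the names `traditional_ba_rounds` and `PROPOSED_ROUNDS` are illustrative and do not come from the paper.

```python
def traditional_ba_rounds(n: int) -> int:
    """Rounds of message exchange used by traditional BA protocols,
    following the abstract's formula floor((n - 1) / 3) + 1."""
    if n < 1:
        raise ValueError("n must be a positive number of processors")
    return (n - 1) // 3 + 1


# Fixed round count claimed in the abstract for the proposed protocol.
PROPOSED_ROUNDS = 3

if __name__ == "__main__":
    for n in (4, 10, 100, 1000):
        print(n, traditional_ba_rounds(n), PROPOSED_ROUNDS)
```

The traditional round count grows linearly with n, while the proposed count stays fixed, which is the gap the abstract emphasises.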

3.

Optimal asynchronous agreement and leader election algorithm for complete networks with Byzantine faulty links

Hasan M. Sayeed Marwan Abu-Amara Hosame Abu-Amara 《Distributed Computing》1995,9(3):147-156

Summary. We consider agreement and leader election on asynchronous complete networks when the processors are reliable, but some of the channels are subject to failure. Fischer, Lynch, and Paterson have already shown that no deterministic algorithm can solve the agreement problem on asynchronous networks if any processor fails during the execution of the algorithm. Therefore, we consider only channel failures. The type of channel failure we consider in this paper is Byzantine failure, that is, channels fail by altering messages, sending false information, forging messages, losing messages at will, and so on. There are no restrictions on the behavior of a faulty channel. Therefore, a faulty channel may act as an adversary who forges messages on purpose to prevent the successful completion of the algorithm. Because we assume an asynchronous network, the channel delays are arbitrary. Thus, the faulty channels may not be detectable unless, for example, the faulty channels cause garbage to be sent. We present the first known agreement and leader election algorithm for asynchronous complete networks in which the processors are reliable but some channels may be Byzantine faulty. The algorithm can tolerate up to ⌊(n−2)/2⌋ faulty channels, where n is the number of processors in the network. We show that the bound on the number of faulty channels is optimal. When the processors terminate their corresponding algorithms, all the processors in the network will have the same correct vector, where the vector contains the private values of all the processors. Received: May 1994/Accepted: July 1995

4.

The incremental agreement

M.L. Chiang L.Y. Tseng 《Information Processing Letters》2008,107(5):165-170

To achieve reliable distributed systems, fault tolerance must be studied. One of the most important fault-tolerance problems is the Byzantine Agreement (BA) problem. The primary issue surrounding BA is that fault-free processors must obtain common agreement even in cases where faults persist. In this field, the fault diagnosis protocol has been proposed so that each fault-free processor detects/locates a common set of faulty processors. In this study, the incremental agreement is invoked so that each processor can reach agreement upon executing the fault diagnosis protocol, using minimal rounds of message exchange in the presence of dual failure characteristics of processors.

5.

Lower bound for scalable Byzantine Agreement

Dan Holtby Bruce M. Kapron Valerie King 《Distributed Computing》2008,21(4):239-248

We consider the problem of computing Byzantine Agreement in a synchronous network with n processors, each with a private random string, where each pair of processors is connected by a private communication line. The adversary is malicious and non-adaptive, i.e., it must choose the processors to corrupt at the start of the algorithm. Byzantine Agreement is known to be computable in this model in an expected constant number of rounds. We consider a scalable model where in each round each uncorrupt processor can send to any set of log n other processors and listen to any set of log n processors. We define the loss of an execution to be the number of uncorrupt processors whose output does not agree with the output of the majority of uncorrupt processors. We show that if there are t corrupt processors, then any randomised protocol which has probability at least 1/2 + 1/log n of loss less than requires at least f rounds. This also shows that lossless protocols require both rounds, and for at least one uncorrupt processor to send messages during the protocol.

6.

Gossiping by processors prone to omission failures

Dariusz R. Kowalski 《Information Processing Letters》2009,109(6):308-314

We consider the gossip problem in a synchronous message-passing system. Participating processors are prone to omission failures, that is, a faulty processor may fail to send or receive a message. The gossip problem in the fault-tolerant setting is defined as follows: every correct processor must learn the initial value of any other processor, unless the other one is faulty; in the latter case either the initial value or the information about the fault must be learned. We develop two efficient algorithms that solve the gossip problem in time O(log n), where n is the number of processors in the system. The first one is an explicit algorithm (i.e., constructed in polynomial time) sending O(n log n + f²) messages, and the second one reduces the message complexity to O(n + f²), where f is the upper bound on the number of faulty processors.

7.

Broadcasting multiple messages in the multiport model

A. Bar-Noy Ching-Tien Ho 《IEEE Transactions on Parallel and Distributed Systems》1999,10(5):500-508

We consider the problem of broadcasting multiple messages from one processor to many processors in the k-port model for message-passing systems. In such systems, processors communicate in rounds, where in every round, each processor can send k messages to k processors and receive k messages from k processors. In this paper, we first present a simple and practical algorithm based on variations of k complete k-ary trees. We then present an optimal algorithm up to an additive term of one for this problem for any number of processors, any number of messages, and any value for k.

8.

Parallel Algorithms for Maximum Matching in Complements of Interval Graphs and Related Problems

M. G. Andrews M. J. Atallah D. Z. Chen D. T. Lee 《Algorithmica》2000,26(2):263-289

Given a set of n intervals representing an interval graph, the problem of finding a maximum matching between pairs of disjoint (nonintersecting) intervals has been considered in the sequential model. In this paper we present parallel algorithms for computing maximum cardinality matchings among pairs of disjoint intervals in interval graphs in the EREW PRAM and hypercube models. For the general case of the problem, our algorithms compute a maximum matching in O(log³ n) time using O(n/log² n) processors on the EREW PRAM and using n processors on the hypercubes. For the case of proper interval graphs, our algorithm runs in O(log n) time using O(n) processors if the input intervals are not given already sorted and using O(n/log n) processors otherwise, on the EREW PRAM. On n-processor hypercubes, our algorithm for the proper interval case takes O(log n log log n) time for unsorted input and O(log n) time for sorted input. Our parallel results also lead to optimal sequential algorithms for computing maximum matchings among disjoint intervals. In addition, we present an improved parallel algorithm for maximum matching between overlapping intervals in proper interval graphs. Received November 20, 1995; revised September 3, 1998.

9.

A Note on Parallel Selection on Coarse-Grained Multicomputers

E. L. G. Saukas S. W. Song 《Algorithmica》1999,24(3-4):371-380

Consider the selection problem of determining the kth smallest element of a set of n elements. Under the CGM (coarse-grained multicomputer) model with p processors and O(n/p) local memory, we present a deterministic parallel algorithm for the selection problem that requires O(log p) communication rounds. Besides requiring a low number of communication rounds, the algorithm also attempts to minimize the total amount of data transmitted in each round (only O(p) except in the last round). In addition to showing theoretical complexities, we present very promising experimental results obtained on a parallel machine that show almost linear speedup, indicating the efficiency and scalability of the proposed algorithm. Received June 1, 1997; revised March 10, 1998.
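To illustrate how a round-based selection of this kind might look, here is a minimal sequential simulation in Python. It is a sketch under stated assumptions, not the paper's algorithm: the pivot rule (weighted median of the local medians) is chosen here for illustration, the final step of gathering the remaining elements on one processor is omitted, and the names `weighted_median` and `cgm_select` are hypothetical. Each list in `parts` stands for one processor's local data, and each loop iteration stands for one communication round in which only a median and a few counts per processor would be exchanged.

```python
def weighted_median(pairs):
    """Return the value at which the cumulative weight first reaches
    half of the total weight. `pairs` is a list of (value, weight)."""
    pairs = sorted(pairs)
    total = sum(weight for _, weight in pairs)
    running = 0
    for value, weight in pairs:
        running += weight
        if 2 * running >= total:
            return value
    raise ValueError("pairs must be non-empty")


def cgm_select(parts, k):
    """Simulate round-based selection of the k-th smallest element
    (k is 1-indexed) over data split across several local lists.
    Returns (element, number_of_rounds)."""
    parts = [list(p) for p in parts]
    n = sum(len(p) for p in parts)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of elements")
    rounds = 0
    while True:
        rounds += 1
        # Each simulated processor reports its local median and size.
        reports = [(sorted(p)[len(p) // 2], len(p)) for p in parts if p]
        pivot = weighted_median(reports)
        # Each simulated processor counts elements below / equal to the pivot.
        less = sum(1 for p in parts for x in p if x < pivot)
        equal = sum(1 for p in parts for x in p if x == pivot)
        if less < k <= less + equal:
            return pivot, rounds
        if k <= less:
            parts = [[x for x in p if x < pivot] for p in parts]
        else:
            parts = [[x for x in p if x > pivot] for p in parts]
            k -= less + equal


if __name__ == "__main__":
    data = [[5, 1, 9], [3, 7], [8, 2, 6, 4]]
    print(cgm_select(data, 4))
```

Because the pivot is always an element of the data, every round discards at least one element, so the loop terminates; the sketch makes no claim about matching the O(log p) round bound stated in the abstract.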

10.

A fast pessimistic diagnosis algorithm for generalized hypercube multicomputer systems

Dyi-Rong Duh Chien-Hong Chen Keh-Ning Chang 《The Journal of Supercomputing》2012,61(3):605-618

The reliability of processors is an important issue for designing a massively parallel processing system for which fault-tolerant computing is crucial. In order to achieve high system reliability and availability, a faulty processor (node) when found should be replaced by a fault-free processor. Within a multiprocessor system, the technique of identifying faulty nodes by constructing tests on the nodes and interpreting the test outcomes is known as system-level diagnosis. The topological structure of a multicomputer system can be modeled by a graph of which the vertices and edges correspond to nodes and links of the system, respectively. This work presents a system-level diagnosis algorithm for a generalized hypercube, which is an attractive variant of a hypercube. The proposed algorithm is based on the PMC model and can isolate all faulty nodes to within a set which contains at most one fault-free node. If the total number of nodes to be diagnosed in a generalized hypercube is N, the proposed algorithm can run in O(N log N) time, and being superior to Yang's algorithm proposed in 2004, it can diagnose not only a hypercube but also a generalized hypercube.

11.

Fast Generation of Random Permutations Via Networks Simulation

A. Czumaj P. Kanarek M. Kutylowski K. Lorys 《Algorithmica》1998,21(1):2-20

We consider the problem of generating random permutations with uniform distribution. That is, we require that for an arbitrary permutation π of n elements, with probability 1/n! the machine halts with the ith output cell containing π(i), for 1 ≤ i ≤ n. We study this problem on two models of parallel computations: the CREW PRAM and the EREW PRAM. The main result of the paper is an algorithm for generating random permutations that runs in O(log log n) time and uses O(n^(1+o(1))) processors on the CREW PRAM. This is the first o(log n)-time CREW PRAM algorithm for this problem. On the EREW PRAM we present a simple algorithm that generates a random permutation in time O(log n) using n processors and O(n) space. This algorithm outperforms each of the previously known algorithms for the exclusive write PRAMs. The common and novel feature of both our algorithms is first to design a suitable random switching network generating a permutation and then to simulate this network on the PRAM model in a fast way. Received November 1996; revised March 1997.
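For reference, the output requirement stated in this abstract (every permutation of n elements produced with probability 1/n!) is the same one met by the classical sequential Fisher-Yates shuffle. The Python sketch below shows that sequential baseline only; it is not the switching-network PRAM algorithm of the paper, and the function name `random_permutation` is illustrative.

```python
import random


def random_permutation(n, rng=None):
    """Return a uniformly random permutation of 1..n using the
    sequential Fisher-Yates shuffle (O(n) time, one processor)."""
    rng = rng or random.Random()
    perm = list(range(1, n + 1))
    # Walk from the last cell down; swap each cell with a uniformly
    # chosen cell at or before it. randint is inclusive on both ends.
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)
        perm[i], perm[j] = perm[j], perm[i]
    return perm


if __name__ == "__main__":
    print(random_permutation(8, random.Random(2024)))
```

Cell i of the returned list plays the role of the ith output cell holding π(i). Passing a seeded `random.Random` makes a run reproducible, which is convenient when comparing against a parallel implementation.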

12.

Optimal Parallel Randomized Algorithms for the Voronoi Diagram of Line Segments in the Plane

Rajasekaran Ramaswami 《Algorithmica》2002,33(4):436-460

Abstract. We present an optimal parallel randomized algorithm for the Voronoi diagram of a set of n nonintersecting (except possibly at endpoints) line segments in the plane. Our algorithm runs in O(log n) time with high probability using O(n) processors on a CRCW PRAM. This algorithm is optimal in terms of work done since the sequential time bound for this problem is Ω(n log n). Our algorithm improves by an O(log n) factor the previously best known deterministic parallel algorithm, given by Goodrich, Ó Dúnlaing, and Yap, which runs in O(log² n) time using O(n) processors. We obtain this result by using a new "two-stage" random sampling technique. By choosing large samples in the first stage of the algorithm, we avoid the hurdle of problem-size "blow-up" that is typical in recursive parallel geometric algorithms. We combine the two-stage sampling technique with efficient search and merge procedures to obtain an optimal algorithm. This technique gives an alternative optimal algorithm for the Voronoi diagram of points as well (all other optimal parallel algorithms for this problem use the transformation to three-dimensional half-space intersection).

13.

Efficient Parallel Computation of the Characteristic Polynomial of a Sparse, Separable Matrix

J. H. Reif 《Algorithmica》2001,29(3):487-510

This paper is concerned with the problem of computing the characteristic polynomial of a matrix. In a large number of applications, the matrices are symmetric and sparse, with O(n) non-zero entries. The problem has an efficient sequential solution in this case, requiring O(n²) work by use of the sparse Lanczos method. A major remaining open question is to find a polylog time parallel algorithm with matching work bounds. Unfortunately, the sparse Lanczos method cannot be parallelized to faster than time Ω(n) using n processors. Let M(n) be the processor bound to multiply two n × n matrices in O(log n) parallel time. Giesbrecht [G2] gave the best previous polylog time parallel algorithms for the characteristic polynomial of a dense matrix with O(M(n)) processors. There is no known improvement to this processor bound in the case where the matrix is sparse. Often, in addition to being symmetric and sparse, the matrix has a sparsity graph (which has edges between indices of the matrix with non-zero entries) that has small separators. This paper gives a new algorithm for computing the characteristic polynomial of a sparse symmetric matrix, assuming that the sparsity graph is s(n)-separable and has a separator of size s(n) = O(n^γ), for some γ, 0 < γ < 1, that when deleted results in connected components of ≤ αn vertices, for some 0 < α < 1, with the same property. We derive an interesting algebraic version of Nested Dissection, which constructs a sparse factorization of the matrix A − λI_n, where A is the input matrix and I_n is the n × n identity matrix. While Nested Dissection is commonly used to minimize the fill-in in the solution of sparse linear systems, our innovation is to use the separator structure to bound also the work for manipulation of rational functions in the recursively factored matrices. The matrix elements are assumed to be over an arbitrary field.
We compute the characteristic polynomial of a sparse symmetric matrix in polylog time using P(n)(n + M(s(n))) ≤ P(n)(n + s(n)^2.376) processors, where P(n) is the processor bound to multiply two degree n polynomials in O(log n) parallel time using a PRAM (P(n) = O(n) if the field supports an FFT of size n but is otherwise O(n log log n) [CK]). Our method requires only that a matrix be symmetric and non-singular (it need not be positive definite as usual for Nested Dissection techniques). For the frequently occurring case where the matrix has small separator size, our polylog parallel algorithm has work bounds competitive with the best known sequential algorithms (i.e., the Ω(n²) work of sparse Lanczos methods), for example, when the sparsity graph is a planar graph, s(n) ≤ O(√n), and we require polylog time with only P(n)n^1.188 processors. Received September 26, 1997; revised June 5, 1999.

14.

Pipelined Diagnosis of Wafer-Scale Linear Arrays

《Journal of Parallel and Distributed Computing》1994,20(2):212-223

We present a comparison-based algorithm for identifying faulty and fault-free elements in a wafer-scale linear array of processors (or other logic elements). Only nearest neighbor communication is assumed to be possible between the processors in the array. Because the algorithm is simple and requires no storage of test vectors or test outcomes, it is ideally suited for implementation on the wafer to provide the capability for built-in production (or post production) testing. We show that, surprisingly, this method achieves high accuracy of diagnosis over a wide range of yields even though the diagnosis may be based on a high proportion of results produced by faulty processors. We also propose an improvement to the above algorithm which uses a processor diagnosed as fault-free by the basic algorithm as the starting point in improving the accuracy with which faulty processors are identified. Quantitative and qualitative reasoning validate the efficiency of these schemes.

15.

Eventual strong consensus with fault detection in the presence of dual failure mode on processors under dynamic networks

《Journal of Network and Computer Applications》2012,35(4):1260-1276

The fault tolerance capability and reliability of a distributed system can be enhanced if the Strong Consensus (SC) problem can be properly addressed. Most of the extant SC protocols are designed for static networks. Besides, the number of rounds of message exchange required by all of the extant SC protocols is determined by the total number of processors in the network rather than by the actual number of faulty processors in the network. Even if there are only a few faulty processors, or none, in the network, the SC protocols may waste a lot of time and memory space on many unnecessary rounds of message exchange. Thus, this paper revisits the SC problem in dynamic networks and uses two rules, Detection Rule for Malicious fault in dynamic network (DRM_dyn) and Early Stopping Rule for Strong Consensus protocol in dynamic networks (ESRSC_dyn), to reduce the time consumption and space complexity of SC protocols. DRM_dyn is a rule that detects malicious processors, and ESRSC_dyn is a rule that determines whether the messages collected are enough for reaching a strong consensus. In short, the proposed SC protocol can not only work in dynamic networks consisting of both dormant processors and malicious processors (dual failure mode) but also ensure that all correct processors reach an SC value within fewer rounds of message exchange than required by the extant SC protocols.

16.

Efficient Construction of Minimum Makespan Schedules for Tasks with a Fixed Number of Distinct Execution Times

D. J. Rosenkrantz L. Yu S. S. Ravi 《Algorithmica》2001,30(1):83-100

This paper addresses scheduling problems for tasks with release and execution times. We present a number of efficient and easy-to-implement algorithms for constructing schedules of minimum makespan when the number of distinct task execution times is fixed. For a set of independent tasks, our algorithm in the single processor case runs in time linear in the number of tasks; with precedence constraints, our algorithm runs in time linear in the sum of the number of tasks and the size of the precedence constraints. In the multi-processor case, our algorithm constructs minimum makespan schedules for independent tasks with uniform execution times. The algorithm runs in O(n log m) time where n is the number of tasks and m is the number of processors. Received September 25, 1997; revised June 11, 1998.

17.

Minimum Assignment of Test Links for Hypercubes with Lower Fault Bounds

Dajin Wang Zhongxian Wang 《Journal of Parallel and Distributed Computing》1997,40(2):545

In an n-dimensional hypercube multiprocessor system, to correctly diagnose faulty processors among themselves, the maximum allowed number of faulty processors is n under the well-known PMC diagnostic model. When the fault bound n is adopted, all links between processors will be used in the diagnosis. However, if the fault bound is lower than n, many links can be freed from the task of performing diagnosis. In this paper, we show that each drop of the fault bound by 1 will free 2^(n−1) links from diagnosis. We will present an algorithm that selects, in a symmetric manner, the to-be-freed links, so that only a minimum number of links will be used to perform diagnosis. A rigorous proof for the algorithm's correctness is given. The freed links will never be used for the purpose of diagnosis, so that the diagnosis and some conventional computations may be carried out simultaneously, improving the performance of the system as a whole.
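The link arithmetic in this abstract can be checked with a short calculation. The sketch below assumes the freed-link count per unit drop of the fault bound is 2^(n−1), and uses the standard fact that an n-dimensional hypercube has n·2^(n−1) links; the function names `total_links` and `diagnosis_links` are illustrative and not from the paper. It only counts links and does not reproduce the paper's algorithm for choosing which links to free.

```python
def total_links(n: int) -> int:
    """Number of links in an n-dimensional hypercube: n * 2^(n-1)."""
    if n < 1:
        raise ValueError("dimension n must be at least 1")
    return n * 2 ** (n - 1)


def diagnosis_links(n: int, fault_bound: int) -> int:
    """Links still needed for diagnosis when the fault bound is
    `fault_bound` <= n, assuming each unit drop below n frees
    2^(n-1) links."""
    if not 0 <= fault_bound <= n:
        raise ValueError("fault bound must lie between 0 and n")
    freed = (n - fault_bound) * 2 ** (n - 1)
    return total_links(n) - freed


if __name__ == "__main__":
    n = 4
    for t in range(n, 0, -1):
        print(t, diagnosis_links(n, t), total_links(n) - diagnosis_links(n, t))
```

Under this reading, a fault bound of t leaves t·2^(n−1) links in use, so for n = 4 lowering the bound from 4 to 3 frees 8 of the 32 links.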

18.

Efficient Parallel Graph Algorithms for Coarse-Grained Multicomputers and BSP

F. Dehne A. Ferreira E. Cáceres S. W. Song A. Roncato 《Algorithmica》2002,33(2):183-200

In this paper we present deterministic parallel algorithms for the coarse-grained multicomputer (CGM) and bulk synchronous parallel (BSP) models for solving the following well-known graph problems: (1) list ranking, (2) Euler tour construction in a tree, (3) computing the connected components and spanning forest, (4) lowest common ancestor preprocessing, (5) tree contraction and expression tree evaluation, (6) computing an ear decomposition or open ear decomposition, and (7) 2-edge connectivity and biconnectivity (testing and component computation). The algorithms require O(log p) communication rounds with linear sequential work per round (p = number of processors, N = total input size). Each processor creates, during the entire algorithm, messages of total size O(log(p) (N/p)). The algorithms assume that the local memory per processor (i.e., N/p) is larger than p^ε, for some fixed ε > 0. Our results imply BSP algorithms with O(log p) supersteps, O(g log(p) (N/p)) communication time, and O(log(p) (N/p)) local computation time. It is important to observe that the number of communication rounds/supersteps obtained in this paper is independent of the problem size, and grows only logarithmically with respect to p. With growing problem size, only the sizes of the messages grow but the total number of messages remains unchanged. Due to the considerable protocol overhead associated with each message transmission, this is an important property. The result for Problem (1) is a considerable improvement over those previously reported. The algorithms for Problems (2)-(7) are the first practically relevant parallel algorithms for these standard graph problems. Received July 5, 2000; revised April 16, 2001.

19.

A Simple and Efficient Randomized Byzantine Agreement Algorithm

《IEEE transactions on pattern analysis and machine intelligence》1985,(6):531-539

A new randomized Byzantine agreement algorithm is presented. This algorithm operates in a synchronous system of n processors, at most t of which can fail. The algorithm reaches agreement in O(t/log n) expected rounds and O(n²t/log n) expected message bits independent of the distribution of processor failures. This performance is further improved to a constant expected number of rounds and O(n²) message bits if the distribution of processor failures is assumed to be uniform. In either event, the algorithm improves on the known lower bound on rounds for deterministic algorithms. Some other advantages of the algorithm are that it requires no cryptographic techniques, that the amount of local computation is small, and that the expected number of random bits used per processor is only one. It is argued that in many practical applications of Byzantine agreement, the randomized algorithm of this paper achieves superior performance.

20.

Optimal Sublogarithmic Time Parallel Algorithms on Rooted Forests

G. Sajith S. Saxena 《Algorithmica》2000,27(2):187-197

The problem of finding a sublogarithmic time optimal parallel algorithm for 3-colouring rooted forests has long been open. We settle this problem by obtaining an O((log log n) log*(log* n)) time optimal parallel algorithm on a TOLERANT Concurrent Read Concurrent Write (CRCW) Parallel Random Access Machine (PRAM). Furthermore, we show that if f(n) is the running time of the best known algorithm for 3-colouring a rooted forest on a COMMON or TOLERANT CRCW PRAM, a fractional independent set of the rooted forest can be found in O(f(n)) time with the same number of processors, on the same model. Using these results, it is shown that decomposable top-down algebraic computation and, hence, depth computation (ranking), 2-colouring and prefix summation on rooted forests can be done in O(log n) optimal time on a TOLERANT CRCW PRAM. These algorithms have been obtained by proving a result of independent interest, one concerning the self-simulation property of TOLERANT: an N-processor TOLERANT CRCW PRAM that uses an address space of size O(N) only, can be simulated on an n-processor TOLERANT PRAM in O(N/n) time, with no asymptotic increase in space or cost, when n = O(N/log log N). Received May 20, 1997; revised June 15, 1998.