首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 125 毫秒
1.
This paper presents an optimal parallel algorithm for triangulating an arbitrary set ofn points in the plane. The algorithm runs inO(logn) time usingO(n) space andO(n) processors on a Concurrent-Read, Exclusive-Write Parallel RAM model (CREW PRAM). The parallel lower bound on triangulation is (logn) time so the best possible linear speedup has been achieved. A parallel divide-and-conquer technique of subdividing a problem into subproblems is employed.  相似文献   

2.
We present a simple parallel algorithm for computing the greatest common divisor (gcd) of twon-bit integers in the Common version of the CRCW model of computation. The run-time of the algorithm in terms of bit operations isO(n/logn), usingn 1+ processors, where is any positive constant. This improves on the algorithm of Kannan, Miller, and Rudolph, the only sublinear algorithm known previously, both in run time and in number of processors; they requireO(n log logn/logn),n 2 log2 n, respectively, in the same CRCW model.We give an alternative implementation of our algorithm in the CREW model. Its run-time isO(n log logn/logn), usingn 1+ processors. Both implementations can be modified to yield the extended gcd, within the same complexity bounds.Supported in part by an IBM Graduate Fellowship and a Bantrell Postdoctoral Fellowship.Supported in part by a Weizmann Postdoctoral Fellowship.4 All logarithms are to base 2.  相似文献   

3.
We present an optimal parallel algorithm for computing a cycle separator of ann-vertex embedded planar undirected graph inO(logn) time onn/logn processors. As a consequence, we also obtain an improved parallel algorithm for constructing a depth-first search tree rooted at any given vertex in a connected planar undirected graph in O(log2 n) time on n/logn processors. The best previous algorithms for computing depth-first search trees and cycle separators achieved the same time complexities, but withn processors. Our algorithms run on a parallel random access machine that permits concurrent reads and concurrent writes in its shared memory and allows an arbitrary processor to succeed in case of a write conflict.A preliminary version of this paper appeared as Improved Parallel Depth-First Search in Undirected Planar Graphs in theProceedings of the Third Workshop on Algorithms and Data Structures, 1993, pp. 407–420.Supported in part by NSF Grant CCR-9101385.  相似文献   

4.
Let (G) denote the independence number of a graphG, that is the maximum number of pairwise independent vertices inG. We present a parallel algorithm that computes in a planar graphG = (V, E), an independent set such that ¦I¦ (G)/2. The algorithm runs in timeOlog2 n) and requires a linear number of processors. This is achieved by denning a new set of reductions that can be executed locally and simultaneously; furthermore, it is shown that a constant fraction of the vertices in the graph are reducible. This is the best known approximation scheme when the number of processors available is linear; parallel implementation of known sequential algorithms requires many more processors.Joseph Naor was supported by Contract ONR N00014-88-K-0166. Most of this work was done while he was a post-doctoral fellow at the Department of Computer Science, University of Southern California, Los Angeles, CA 90089-0782, USA.  相似文献   

5.
On parallel integer sorting   总被引:1,自引:0,他引:1  
We present an optimal algorithm for sortingn integers in the range [1,n c ] (for any constantc) for the EREW PRAM model where the word length isn , for any >0. Using this algorithm, the best known upper bound for integer sorting on the (O(logn) word length) EREW PRAM model is improved. In addition, a novel parallel range reduction algorithm which results in a near optimal randomized integer sorting algorthm is presented. For the case when the keys are uniformly distributed integers in an arbitrary range, we give an algorithm whose expected running time is optimal.Supported by NSF-DCR-85-03251 and ONR contract N00014-87-K-0310  相似文献   

6.
LetR be a unidirectional asynchronous ring ofn identical processors each with a single input bit. Letf be any cyclic nonconstant function ofn boolean variables. Moran and Warmuth (1986) prove that anydeterministic algorithm that evaluatesf onR has communication complexity (n logn) bits. They also construct a family of cyclic nonconstant boolean functions that can be evaluated inO(n logn) bits by a deterministic algorithm.This contrasts with the following new results:
1.  There exists a family of cyclic nonconstant boolean functions which can be evaluated with expected complexity bits by arandomized algorithm forR.
2.  Anynondeterministic algorithm forR which evaluates any cyclic nonconstant function has communication complexity bits.
  相似文献   

7.
The Interpolation-Based Bintree (IBB) is a storage-saving encodingscheme for representing binary images. In this paper, we presentefficient parallel algorithms for important manipulations on IBBcoded images (also called bincodes). Given a set of bincodes, e.g.,B with size n, the 4-neighborfinding and the diagonal-neighbor finding algorithms onB can be accomplished in O(1) time on an n x n mesh computer with multiple broadcasting(MMB). Given two sets of bincodes, B 1 and B 2, with size n and m n, respectively, the intersection and unionoperations for B 1 and B 2 can be performedin O(1) time on an MMB using processors. With n 2 processors, the complementoperation for B can be performed in O(logn) time.  相似文献   

8.
Given an array ofn input numbers, therange-maxima problem is that of preprocessing the data so that queries of the type what is the maximum value in subarray [i..j] can be answered quickly using one processor. We present a randomized preprocessing algorithm that runs inO(log* n) time with high probability, using an optimal number of processors on a CRCW PRAM; each query can be processed in constant time by one processor. We also present a randomized algorithm for a parallel comparison model. Using an optimal number of processors, the preprocessing algorithm runs inO( (n)) time with high probability; each query can be processed inO ( (n)) time by one processor. (As is standard, (n) is the inverse of Ackermann function.) A constant time query can be achieved by some slowdown in the performance of the preprocessing stage.  相似文献   

9.
A new algorithm for solving systems of linear equations Ax = b in an Euclidean domain is suggested. In the case of the ring of integers, the complexity of this algorithm is O (n 3 mlog2 ||A||), where n)$$ " align="middle" border="0"> is a matrix of rank n and , if standard algorithms for the multiplication of integers and matrices are used. Under the same conditions, the best algorithm of this kind among those published earlier, which was suggested by Labahn and Storjohann in [1], has complexity O (n 4 mlog2 ||A||). True, when using fast algorithms for the multiplication of numbers and matrices, the theoretical complexity estimate for the latter algorithm is O (n mlog2 ||A||), which is better than the similar estimate O (n 3 mlog||A||) for the new algorithm.  相似文献   

10.
We present parallel algorithms to construct binary trees with almost optimal weighted path length. Specifically, assuming that weights are normalized (to sum up to one) and error refers to the (absolute) difference between the weighted path length of a given tree and the optimal tree with the same weights, we present anO (logn)-time andn(log lognl logn)-EREW-processor algorithm which constructs a tree with error less than 0.18, andO (k logn log* n)-time andn-CREW-processor algorithm which produces a tree with error at most l/n k , and anO (k 2 logn)-time andn 2-CREW-processor algorithm which produces a tree with error at most l/n k . As well, we describe two sequential algorithms, anO(kn)-time algorithm which produces a tree with error at most l/n k , and anO(kn)-time algorithm which produces a tree with error at most . The last two algorithms use different computation models.The first author's research was supported in part by NSERC Research Grant 3053. A part of this work was done while the second author was at the University of British Columbia.  相似文献   

11.
This paper studies on-line scheduling in a single-processor system that allows preemption. The aim is to maximize the total value of jobs completed by their deadlines. It is known that if the on-line scheduler is given a processor faster (say, two times faster) than the off-line scheduler, then there exists an on-line algorithm called that can achieve an O(1) competitive ratio. In this paper, we show that using additional unit-speed processors instead of a faster processor is a possible but not cost effective way to achieve an O(1) competitive ratio. Specifically, we find that-(logk) unit-speed processors are required, where k is the importance ratio. Another contribution of this paper is an improved analysis of the competitiveness of ; this new analysis enables us to show that , when extended to multi-processor systems, can still guarantee an O(1) competitive ratio.  相似文献   

12.
We give the first efficient parallel algorithms for solving the arrangement problem. We give a deterministic algorithm for the CREW PRAM which runs in nearly optimal bounds ofO (logn log* n) time andn 2/logn processors. We generalize this to obtain anO (logn log* n)-time algorithm usingn d /logn processors for solving the problem ind dimensions. We also give a randomized algorithm for the EREW PRAM that constructs an arrangement ofn lines on-line, in which each insertion is done in optimalO (logn) time usingn/logn processors. Our algorithms develop new parallel data structures and new methods for traversing an arrangement.This work was supported by the National Science Foundation, under Grants CCR-8657562 and CCR-8858799, NSF/DARPA under Grant CCR-8907960, and Digital Equipment Corporation. A preliminary version of this paper appeared at the Second Annual ACM Symposium on Parallel Algorithms and Architectures [3].  相似文献   

13.
Previous research on developing parallel triangulation algorithms concentrated on triangulating planar point sets.O(log3 n) running time algorithms usingO(n) processors have been developed in Refs. 1 and 2. Atallah and Goodrich(3) presented a data structure that can be viewed as a parallel analogue of the sequential plane-sweeping paradigm, which can be used to triangulate a planar point set inO(logn loglogn) time usingO(n) processors. Recently Merks(4) described an algorithm for triangulating point sets which runs inO(logn) time usingO(n) processors, and is thus optimal. In this paper we develop a parallel algorithm for triangulating simplicial point sets in arbitrary dimensions based on the idea of the sequential algorithm presented in Ref. 5. The algorithm runs inO(log2 n) time usingO(n/logn) processors. The algorithm hasO(n logn) as the product of the running time and the number of processors; i.e., an optimal speed-up.  相似文献   

14.
In this paper we give a parallel algorithm for line-segment intersection reporting in the plane. It runs in timeO(((n +k) logn log logn)/p) usingp processors on a concurrent-read-exclusive-write (CREW)-PRAM, wheren is the number of line segments,k is the number of intersections, andp n +k.This work was supported by the DFG, SFB 124, TP B2, VLSI Entwurfsmethoden und Parallelität.  相似文献   

15.
The time complexity of searching a sorted list ofn elements in parallel on a coarse grained network of diameterD and consisting ofN processors (wheren may be much larger thanN) is studied. The worst case period and latency of a sequence of pipeline search operation are easity seen to be (logn–logN) and (D+logn–logN), respectively. Since forn=N 1+(1) the worst-case period is (logn) (which can be achieved by a single processor), coarse-grained networks appear to be unsuitable for the search problem. By contrast, it is demonstrated using standard queuing theory techniques that a constant expected period can be achieved provided thatn=O(N2 N ).This research was supported by the Natural Sciences and Engineering Research Council of Canada under Grants A3336 and A9173.  相似文献   

16.
Given two finite sets of points in a plane, the polygon separation problem is to construct a separating convexk-gon with smallestk. In this paper, we present a parallel algorithm for the polygon separation problem. The algorithm runs inO(logn) time on a CREW PRAM withn processors, wheren is the number of points in the two given sets. The algorithm is cost-optimal, since (n logn) is a lower-bound for the time needed by any sequential algorithm. We apply this algorithm to the problem of finding a convex polygon, with the minimal number of edges, for which a given convex region is its digital image. The algorithm in this paper constructs one such polygon with possibly two more edges than the minimal one.The research is sponsored by NSERC Operating Grant OGPIN 007.  相似文献   

17.
Parallel integer sorting using small operations   总被引:1,自引:0,他引:1  
We consider the problem of sortingn integers in the range [0,n c -1], wherec is a constant. It has been shown by Rajasekaran and Sen [14] that this problem can be solved optimally inO(logn) steps on an EREW PRAM withO(n) n -bit operations, for any constant >O. Though the number of operations is optimal, each operation is very large. In this paper, we show thatn integers in the range [0,n c -1] can be sorted inO(logn) time withO(nlogn)O(1)-bit operations andO(n) O(logn)-bit operations. The model used is a non-standard variant of an EREW PRAMtthat permits processors to have word-sizes ofO(1)-bits and (logn)-bits. Clearly, the speed of the proposed algorithm is optimal. Considering that the input to the problem consists ofO (n logn) bits, the proposed algorithm performs an optimal amount of work, measured at the bit level.This work was partially supported by The Northeast Parallel Architectures Center (NPAC) at Syracuse University, Syracuse, NY 13244 and The Rome Air Development Center, under contract F30602-88-D-0027.  相似文献   

18.
We present a randomized parallel algorithm that computes the greatest common divisor of two integers of n bits in length with probability 1−o(1) that takes O(nloglogn/logn) time using O(n6+?) processors for any ?>0 on the EREW PRAM parallel model of computation. The algorithm either gives a correct answer or reports failure.We believe this to be the first randomized sublinear time algorithm on the EREW PRAM for this problem.  相似文献   

19.
By restricting weight functions to satisfy the quadrangle inequality or the inverse quadrangle inequality, significant progress has been made in developing efficient sequential algorithms for the least-weight subsequence problem [10], [9], [12], [16]. However, not much is known on the improvement of the naive parallel algorithm for the problem, which is fast but demands too many processors (i.e., it takesO(log2 n) time on a CREW PRAM with n3/logn processors). In this paper we show that if the weight function satisfies the inverse quadrangle inequality, the problem can be solved on a CREW PRAM in O(log2 n log logn) time withn/log logn processors, or in O(log2 n) time withn logn processors. Notice that the processor-time complexity of our algorithm is much closer to the almost linear-time complexity of the best-known sequential algorithm [12].  相似文献   

20.
We present several fast algorithms for multiple addition and prefix sum on the Linear Array with a Reconfigurable Pipelined Bus System (LARPBS), a recently proposed architecture based on optical buses. Our algorithm for adding N integers runs on an N log M-processor LARPBS in O(log* N) time, where log* N is the number of times logarithm has to be taken to reduce N below 1 and M is the largest integer in the input. Our addition algorithm improves the time complexity of several matrix multiplication algorithms proposed by Li, Pan and Zheng (IEEE Trans. Parallel and Distributed Systems, 9(8):705–720, 1998). We also present several fast algorithms for computing prefix sums of N integers on the LARPBS. For integers with bounded magnitude, our first algorithm for prefix sum computation runs in O(log log N) time using N processors and in O(1) time using N 1+ processors, for < 1. For integers with unbounded magnitude, the first algorithm for multiple addition runs in O(log log N log* N) time using N log M processors, when M is the largest integer in the input. Our second algorithm for multiple addition runs in O(log* N) time using N 1+ log M processors, for < 1. We also show suitable extensions of our algorithm for real numbers.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号