Similar literature
Found 20 similar documents (search time: 906 ms)
1.
This paper describes a parallel algorithm for computing the visible portion of a simple planar polygon with N vertices from a given point on or inside the polygon. The algorithm runs in O(k log N) time using O(N/log N) processors, where k is the link diameter of the polygon under consideration. The link diameter of a polygon is the maximum number of straight line segments needed to connect any two points within the polygon, where all line segments lie completely within the polygon. The algorithm can also be used to compute the visible portion of the plane from a point outside the polygon; in that case, the parameter k in the asymptotic bounds is the link diameter of a different polygon. The algorithm is optimal for sets of polygons that have a constant link diameter. It is a rather simple algorithm with a very small run-time constant, making it fast and practical to implement. The interprocessor communication involves only local neighbor communication and scan operations (i.e., parallel prefix operations). Thus the algorithm can be implemented not only on an EREW PRAM but also on a variety of more practical machine architectures, such as hypercubes, trees, butterflies, and shuffle-exchange networks. The algorithm was implemented on the Connection Machine as well as the MasPar MP-1, and various performance tests were conducted.
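The scan (parallel prefix) operation mentioned in this abstract can be illustrated with a small sequential simulation. The sketch below is not taken from the paper: it assumes the textbook doubling scheme (often attributed to Hillis and Steele), in which each list comprehension stands for one synchronous parallel round. The function name `inclusive_scan` is hypothetical, and the sketch ignores both work efficiency and the exclusive-read restriction of the EREW PRAM.

```python
def inclusive_scan(values, op=lambda a, b: a + b):
    """Simulate an inclusive prefix scan in about log2(n) synchronous rounds."""
    result = list(values)
    step = 1
    while step < len(result):
        # Each "processor" i reads only the previous round's array,
        # so one comprehension models one synchronous parallel step.
        previous = result
        result = [
            op(previous[i - step], previous[i]) if i >= step else previous[i]
            for i in range(len(previous))
        ]
        step *= 2
    return result


print(inclusive_scan([3, 1, 4, 1, 5]))
print(inclusive_scan([2, 7, 1, 8], max))
```

The operator must be associative (addition, minimum, maximum); for n inputs the loop body runs about log2(n) times, which is the logarithmic round count that makes scans attractive on the architectures listed above.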

2.
The distance transform (DT) is an image computation tool that can be used to extract information about the shape and position of the foreground pixels relative to each other. It converts a binary image into a grey-level image in which each pixel has a value corresponding to the distance to the nearest foreground pixel. The time complexity of computing the distance transform depends on the distance metric used: the more exact the distance transform, the longer the execution time. Nowadays, thousands of images often must be processed in a limited time, and a sequential computer can hardly compute the distance transform for them in real time. To provide efficient distance transform computation, it is therefore desirable to develop a parallel algorithm for this operation. In this paper, based on the diagonal propagation approach, we first provide an O(N^2) time sequential algorithm to compute the chessboard distance transform (CDT) of an N × N image, which is a DT using the chessboard distance metric. Based on the proposed sequential algorithm, the CDT of a 2D binary image array of size N × N can be computed in O(log N) time on the EREW PRAM model using O(N^2/log N) processors, O(log log N) time on the CRCW PRAM model using O(N^2/log log N) processors, and O(log N) time on the hypercube computer using O(N^2/log N) processors. Following the mapping proposed by Lee and Horng, the algorithm for the medial axis transform is also efficiently derived. The medial axis transform of a 2D binary image array of size N × N can be computed in O(log N) time on the EREW PRAM model using O(N^2/log N) processors, O(log log N) time on the CRCW PRAM model using O(N^2/log log N) processors, and O(log N) time on the hypercube computer using O(N^2/log N) processors. The proposed parallel algorithms are composed of a set of prefix operations. In each prefix operation phase, only the increment (add-one) operation and the minimum operation are employed, so the algorithms are especially efficient in practical applications.
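To make the definition of the chessboard distance transform concrete, here is a minimal sequential sketch. It does not reproduce the diagonal propagation approach of the paper; it assumes the classic two-pass raster scan over the 8-neighbourhood instead, which also takes O(N^2) time and, like the abstract describes, uses only add-one and minimum operations. The function name `chessboard_distance_transform` and the list-of-lists image format are assumptions for illustration.

```python
def chessboard_distance_transform(image):
    """Return, per pixel, the chessboard distance to the nearest foreground pixel.

    `image` is a non-empty list of rows; truthy entries are foreground.
    """
    rows, cols = len(image), len(image[0])
    inf = rows + cols  # larger than any chessboard distance inside the image
    dist = [[0 if image[r][c] else inf for c in range(cols)] for r in range(rows)]

    # Forward pass: propagate from neighbours already visited in raster order.
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                pr, pc = r + dr, c + dc
                if 0 <= pr < rows and 0 <= pc < cols:
                    dist[r][c] = min(dist[r][c], dist[pr][pc] + 1)

    # Backward pass: propagate from the remaining four neighbours.
    for r in reversed(range(rows)):
        for c in reversed(range(cols)):
            for dr, dc in ((1, 1), (1, 0), (1, -1), (0, 1)):
                pr, pc = r + dr, c + dc
                if 0 <= pr < rows and 0 <= pc < cols:
                    dist[r][c] = min(dist[r][c], dist[pr][pc] + 1)
    return dist


for row in chessboard_distance_transform([[0, 0, 0], [0, 1, 0], [0, 0, 0]]):
    print(row)
```

If the image has no foreground pixel, every entry keeps the sentinel value `rows + cols`.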

3.
We consider the problem of generating random permutations with uniform distribution. That is, we require that for an arbitrary permutation π of n elements, with probability 1/n! the machine halts with the i-th output cell containing π(i), for 1 ≤ i ≤ n. We study this problem on two models of parallel computation: the CREW PRAM and the EREW PRAM. The main result of the paper is an algorithm for generating random permutations that runs in O(log log n) time and uses O(n^(1+o(1))) processors on the CREW PRAM. This is the first o(log n)-time CREW PRAM algorithm for this problem. On the EREW PRAM we present a simple algorithm that generates a random permutation in time O(log n) using n processors and O(n) space. This algorithm outperforms each of the previously known algorithms for the exclusive-write PRAMs. The common and novel feature of both our algorithms is first to design a suitable random switching network generating a permutation and then to simulate this network on the PRAM model in a fast way. Received November 1996; revised March 1997.
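As a sequential point of reference for the uniformity requirement stated above, the sketch below uses the well-known Fisher-Yates shuffle, under which each of the n! permutations is produced with probability 1/n! (given an ideal random source). It is not the switching-network construction of the paper, and the function name `random_permutation` is an assumption for illustration.

```python
import random


def random_permutation(n, rng=None):
    """Return a uniformly random permutation of 1..n (Fisher-Yates shuffle)."""
    if rng is None:
        rng = random.Random()
    perm = list(range(1, n + 1))
    for i in range(n - 1, 0, -1):
        # Pick a position in perm[0..i] uniformly; randint is inclusive.
        j = rng.randint(0, i)
        perm[i], perm[j] = perm[j], perm[i]
    return perm


print(random_permutation(8))
```

The loop runs n - 1 times, so the sequential cost is O(n); the parallel algorithms in the abstract aim to reach the same distribution in O(log n) or O(log log n) time.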

4.
In a two- or three-dimensional image array, the computation of the Euclidean distance transform (EDT) is an important task. With the increasing application of 3D voxel images, it is useful to consider the distance transform of a 3D digital image array. Because the EDT computation is a global operation, it is prohibitively time-consuming in image processing. To make the transform computation efficient, parallelism is employed. We first derive several important geometric relations and properties among parallel planes. We then develop a parallel algorithm for the three-dimensional Euclidean distance transform (3D-EDT) on the EREW PRAM computation model. The time complexity of our parallel algorithm is O(log^2 N) for an N × N × N image array, and this is currently the best known result. A generalized parallel algorithm for the 3D-EDT is also proposed. We implement the proposed algorithms sequentially, and their performance exceeds that of the existing algorithms (proposed by Yamada, 1984). Finally, we develop the corresponding parallel programs on both an emulated EREW PRAM model computer and the IBM SP2 to verify the speed-up properties of the proposed algorithms.

5.
Stéphane, Pattern Recognition, 1995, 28(12): 1993-2000
We propose a parallel thinning algorithm for binary pictures. Given an N × N binary image including an object, our algorithm computes the skeleton of the object in O(N^2), using a pyramidal decomposition of the picture. The behavior of this algorithm is studied considering a family of digitizations of the same object at different levels of resolution. On the Exclusive Read Exclusive Write (EREW) Parallel Random Access Machine (PRAM), our algorithm runs in O(log N) time using O(N^2/log N) processors, and it is work-optimal. The same result is obtained with high-connectivity distributed memory SIMD machines having strong hypercube and pyramid. We describe the basic operator, the pyramidal algorithm, and some experimental results on the SIMD MasPar parallel machine.

6.
We present a simple algorithm for the Euclidean distance transform of a binary image that runs more efficiently than other algorithms in the literature. We show that our algorithm runs in optimal time for many architectures and has optimal cost for the RAM and EREW PRAM.

7.
On parallel integer sorting (total citations: 1; self-citations: 0; citations by others: 1)
We present an optimal algorithm for sorting n integers in the range [1, n^c] (for any constant c) for the EREW PRAM model where the word length is n^ε, for any ε > 0. Using this algorithm, the best known upper bound for integer sorting on the (O(log n) word length) EREW PRAM model is improved. In addition, a novel parallel range reduction algorithm, which results in a near-optimal randomized integer sorting algorithm, is presented. For the case when the keys are uniformly distributed integers in an arbitrary range, we give an algorithm whose expected running time is optimal. Supported by NSF-DCR-85-03251 and ONR contract N00014-87-K-0310.
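The restriction of keys to the range [1, n^c] is what makes linear-time sorting possible even sequentially. The sketch below is not the parallel algorithm of the paper; it assumes the standard least-significant-digit radix sort with base n, which needs about c stable bucketing passes of O(n) each. The function name `radix_sort_base_n` is hypothetical.

```python
def radix_sort_base_n(keys):
    """Sort non-negative integers with LSD radix sort, using base n = len(keys)."""
    n = len(keys)
    if n < 2:
        return list(keys)
    base = n
    result = list(keys)
    largest = max(result)
    place = 1
    while place <= largest:
        # One stable bucketing pass on the current base-n digit.
        buckets = [[] for _ in range(base)]
        for key in result:
            buckets[(key // place) % base].append(key)
        result = [key for bucket in buckets for key in bucket]
        place *= base
    return result


print(radix_sort_base_n([9, 1, 8, 2, 7, 3]))
```

For keys bounded by n^c the loop performs at most c + 1 passes, so the total sequential work is O(cn), the baseline against which an optimal parallel integer sort is measured.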

8.
An important midlevel task for computer vision is addressed. The problem consists of labeling connected components in N^(1/2) × N^(1/2) binary images. This task can be solved with parallel computers by using a simple and novel algorithm. The parallel computing model used is a synchronous fine-grained shared-memory model where only one processor can read from or write to the same memory location at a given time. This model is known as the exclusive-read exclusive-write parallel RAM (EREW PRAM). Using this model, the algorithm presented has O(log N) complexity. The algorithm can run on parallel machines other than the EREW PRAM. In particular, it offers an optimal image component labeling algorithm for mesh-connected computers.
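The labeling task itself can be stated as a short sequential program. The sketch below is not the parallel algorithm of the paper; it assumes a breadth-first flood fill and 4-connectivity (the abstract does not say which neighbourhood is used), and the name `label_components` is hypothetical.

```python
from collections import deque


def label_components(image):
    """Label 4-connected foreground components; return (labels, component_count)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and labels[r][c] == 0:
                # Unlabeled foreground pixel: start a new component here.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


labels, count = label_components([[1, 1, 0], [0, 0, 0], [0, 1, 1]])
print(count)
for row in labels:
    print(row)
```

Every pixel is enqueued at most once, so the sequential running time is linear in the number of pixels.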

9.
We present a randomized EREW PRAM algorithm to find a minimum spanning forest in a weighted undirected graph. On an n-vertex graph the algorithm runs in o((log n)^(1+ε)) expected time for any ε > 0 and performs linear expected work. This is the first linear-work, polylog-time algorithm on the EREW PRAM for this problem. This also gives parallel algorithms that perform expected linear work on two general-purpose models of parallel computation: the QSM and the BSP.

10.
In this paper we present optimal processor × time parallel algorithms for term matching and anti-unification of terms represented as trees. Term matching is the special case of unification in which one of the terms is restricted to contain no variables. It has wide applicability to logic programming, term rewriting systems, and symbolic pattern matching. Anti-unification is the dual problem of unification, in which one computes the most specific generalization of two terms. It has applications in inductive inference and theorem proving. Our algorithms run in O(log^2 N) time using N/log^2 N processors on a shared-memory model of computation that prohibits simultaneous reads or writes (EREW PRAM). These algorithms are the first polylogarithmic-time EREW algorithms with a processor × time product of the same order as that of their sequential counterparts, thereby permitting optimal speed-ups using any number of processors up to N/log^2 N. We also use the techniques developed in the paper to provide an N/log N-processor, O(log N)-time algorithm for a shared-memory model that allows both simultaneous reads and simultaneous writes (CRCW PRAM). Supported by NSF Grant IRI-88-09324 and NSF/DARPA Grant CCR-8908092.
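Term matching as defined in this abstract (one term has variables, the other has none) can be sketched sequentially as a simultaneous tree walk. The representation below is an assumption for illustration, not the paper's: compound terms are tuples whose first element is the function symbol, constants are plain strings, and variables are strings starting with "?". The name `match_term` is hypothetical.

```python
def match_term(pattern, subject, bindings=None):
    """Match `pattern` (may contain variables) against a variable-free `subject`.

    Returns a dict of variable bindings, or None if the terms do not match.
    """
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, str) and pattern.startswith("?"):
        # A variable must be bound consistently across all its occurrences.
        if pattern in bindings:
            return bindings if bindings[pattern] == subject else None
        bindings[pattern] = subject
        return bindings
    if isinstance(pattern, str) or isinstance(subject, str):
        # A constant matches only the identical constant.
        return bindings if pattern == subject else None
    if pattern[0] != subject[0] or len(pattern) != len(subject):
        return None
    for p_arg, s_arg in zip(pattern[1:], subject[1:]):
        if match_term(p_arg, s_arg, bindings) is None:
            return None
    return bindings


print(match_term(("f", "?x", ("g", "?y")), ("f", "a", ("g", "b"))))
print(match_term(("f", "?x", "?x"), ("f", "a", "b")))
```

The consistency check for repeated variables compares whole subterms, which is the step that a parallel algorithm has to organize carefully to keep the processor × time product linear.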

11.
We present a unified parallel algorithm for constructing various search trees. The tree construction is based on a unified scheme, called bottom-level balancing, which constructs a perfectly balanced search tree having a uniform distribution of keys. The algorithm takes O(log log N) time using N/log log N processors on the EREW PRAM model, and O(1) time with N processors on the CREW PRAM model, where N is the number of keys in the tree.

12.
Considers the use of massively parallel architectures to execute a trace-driven simulation of a single cache set. A method is presented for the least-recently-used (LRU) policy, which, regardless of the set size C, runs in time O(log N) using N processors on the EREW (exclusive read, exclusive write) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. We present timings of this algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference-based line replacement policies is considered, which includes LRU as well as the least-frequently-used (LFU) and random replacement policies. A simulation method is presented for any such policy that, on any trace of length N directed to a C-line set, runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
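For orientation, the problem being parallelized here, a trace-driven simulation of one cache set under LRU replacement, has a very short sequential formulation. The sketch below is that sequential baseline only, not either of the parallel methods in the abstract; the function name `simulate_lru_set` is hypothetical and the set size is assumed to be at least 1.

```python
from collections import OrderedDict


def simulate_lru_set(trace, set_size):
    """Replay `trace` against one LRU cache set; return a hit/miss flag per reference."""
    lines = OrderedDict()  # keys ordered from least to most recently used
    hits = []
    for ref in trace:
        if ref in lines:
            lines.move_to_end(ref)  # mark as most recently used
            hits.append(True)
        else:
            if len(lines) >= set_size:
                lines.popitem(last=False)  # evict the least recently used line
            lines[ref] = None
            hits.append(False)
    return hits


print(simulate_lru_set(["a", "b", "c", "a", "d", "b"], 3))
```

Each reference depends on the set contents left by all earlier references, which is why simulating a trace of length N in O(log N) parallel time is not immediate.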

13.
Traditionally, the block-based medial axis transform (BB-MAT) and the chessboard distance transform (CDT) have been viewed as two completely different image computation problems, especially in three-dimensional (3D) space. In fact, some of their properties are equivalent. The relationship between the two is first derived and proved in this paper. One significant property is that the CDT of a 3D binary image V is equal to the BB-MAT of the image V', where V' denotes the inverse image of V. In a parallel algorithm, the cost is defined as the product of the time complexity and the number of processors used. The main contribution of this work is to reduce the costs of the 3D BB-MAT and 3D CDT solutions proposed by Wang [65]. Based on the reverse-dominance technique, which is redefined from the dominance concept, we compute the 3D CDT by first running the 3D BB-MAT algorithm. For a 3D binary image of size N^3, our parallel algorithm runs in O(log N) time using N^3 processors on the concurrent read exclusive write (CREW) parallel random access machine (PRAM) model to solve both the 3D BB-MAT and the 3D CDT problems. The resulting costs are lower than those of Wang's algorithms. To the best of our knowledge, these are the lowest costs known for 3D BB-MAT and 3D CDT algorithms. In parallel algorithms, the running time can be divided into computation time and communication time. The running, communication, and computation times for different problem sizes were measured experimentally on an HP Superdome with SMP/CC-NUMA (symmetric multiprocessor/cache coherent non-uniform memory access) architecture. We conclude that a parallel computer (i.e., an SMP/CC-NUMA architecture or cluster system) is more suitable for solving problems with large input sizes.

14.
We present a randomized EREW PRAM algorithm to find a minimum spanning forest in a weighted undirected graph. On an n-vertex graph the algorithm runs in o((log n)^(1+ε)) expected time for any ε > 0 and performs linear expected work. This is the first linear-work, polylog-time algorithm on the EREW PRAM for this problem. This also gives parallel algorithms that perform expected linear work on two general-purpose models of parallel computation: the QSM and the BSP.

15.
Efficient parallel processing of image contours (total citations: 1; self-citations: 0; citations by others: 1)
Describes two parallel algorithms for ranking the pixels on a curve in O(log N) time using either an EREW or CREW PRAM model. The algorithms accomplish this with N processors for a √N × √N image. After applying such an algorithm to an image, it is possible to move the pixels from a curve into processors having consecutive addresses. This is important because one can subsequently apply many algorithms to the curve (such as piecewise linear approximation algorithms or point-in-polygon tests) using segmented scan operations (i.e., parallel prefix operations). Scan operations can be executed in logarithmic time on many interconnection networks, such as hypercube, tree, butterfly, and shuffle-exchange machines, as well as on the EREW PRAM. The algorithms were implemented on the hypercube-structured Connection Machine, and various performance tests were conducted.

16.
An optimal parallel algorithm for volume ray casting (total citations: 3; self-citations: 0; citations by others: 3)
Volume rendering by ray casting is computationally expensive. For interactive volume visualization, rendering must be done in real time (30 frames/s). Since the typical size of a 3D dataset is 256^3, parallel processing is imperative. In this paper, we present an O(log n) EREW algorithm for volume rendering. We use O(n^3) processors that can be optimized to O(log^3 n) time with O(n^3/log^3 n) processors. We have implemented our algorithm on a MasPar MP-1. The implementation results show that a frame of size 256^3 is generated in 11 s by 4096 processors. This time can be further reduced by using a larger number of processors.

17.
In this paper we present a data parallel volume rendering algorithm that possesses numerous advantages over prior published solutions. Volume rendering is a three-dimensional graphics rendering algorithm that computes views of sampled medical and simulation data, but has been much slower than other graphics algorithms because of the data set sizes and the computational complexity. Our algorithm uses permutation warping to achieve linear speedup (run time is O(S/P) for P processors when P = O(S/log S) for S = n^3 samples), linear storage (O(S)) for large data sets, arbitrary view directions, and high-quality filters. We derived a new processor permutation assignment of five passes (our prior known solution was eight passes), and a new parallel compositing technique that is essential for scaling linearly on machines that have more processors than view rays to process (P > n^2). We show a speedup of 15.7 for a 16k-processor over a 1k-processor MasPar MP-1 (16 is linear) and two frames/second with a 128^3 volume and trilinear view reconstruction. In addition, we demonstrate volume sizes of 256^3, constant run time over angles from 5° to 75°, filter quality comparisons, and communication congestion of just 19 to 29%.

18.
Let A be a sorted array of n numbers and B a sorted array of m numbers, both in nondecreasing order, with n⩽m. We consider the problem of determining, for each element A(j), j=1, 2, …, n, the unique element B(i), 0⩽i⩽m, such that B(i)⩽A(j)

19.
International Journal of Computer Mathematics, 2012, 89(3-4): 147-158
Graph coloring is an abstraction of scheduling problems. Using an exclusive-read and exclusive-write (EREW) parallel random access machine (PRAM) model, two approximate coloring algorithms are parallelized. The performance analysis reveals that the parallel largest-degree-first algorithm is efficient for regular or near-regular graphs, while the second, a costlier but more easily parallelizable algorithm, yields optimal speedup for graphs of widely varying densities.
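The largest-degree-first heuristic named in this abstract is, in its usual sequential form, a greedy coloring that visits vertices in order of decreasing degree. The sketch below shows only that sequential form under this assumption; it is not the parallelization studied in the paper, and the name `largest_degree_first_coloring` and the adjacency-dict input format are hypothetical.

```python
def largest_degree_first_coloring(adjacency):
    """Greedy coloring: visit vertices by decreasing degree, take the smallest free color.

    `adjacency` maps each vertex to an iterable of its neighbours (undirected graph).
    """
    order = sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=True)
    colors = {}
    for vertex in order:
        # Colors already taken by neighbours that were colored earlier.
        used = {colors[u] for u in adjacency[vertex] if u in colors}
        color = 0
        while color in used:
            color += 1
        colors[vertex] = color
    return colors


graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(largest_degree_first_coloring(graph))
```

The result is always a proper coloring, but as with any approximate coloring algorithm the number of colors used is not guaranteed to be minimal.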

20.
We present a randomized parallel algorithm that computes the greatest common divisor of two integers of n bits in length with probability 1−o(1) that takes O(n log log n/log n) time using O(n^(6+ε)) processors for any ε > 0 on the EREW PRAM parallel model of computation. The algorithm either gives a correct answer or reports failure. We believe this to be the first randomized sublinear-time algorithm on the EREW PRAM for this problem.
