首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
An O(n2) time serial algorithm is developed for obtaining the medial axis transform (MAT) of an n×n image. An O(log n) time CREW PRAM algorithm and an O(log2 n) time SIMD hypercube parallel algorithm for the MAT are also developed. Both of these use O(n2) processors. Two problems associated with the MAT, the area and perimeter reporting problem, are studied. An O(log n) time hypercube algorithm is developed for both of them, where n is the number of squares in the MAT, and the algorithms use O(n2) processors  相似文献   

2.
An algorithm for convolving a k×k window of weighting coefficients with an n×n image matrix on a pyramid computer of O(n2) processors in time O(logn+k2), excluding the time to load the image matrix, is presented. If k=Ω (√log n), which is typical in practice, the algorithm has a processor-time product O(n 2 k2) which is optimal with respect to the usual sequential algorithm. A feature of the algorithm is that the mechanism for controlling the transmission and distribution of data in each processor is finite state, independent of the values of n and k. Thus, for convolving two {0, 1}-valued matrices using Boolean operations rather than the typical sum and product operations, the processors of the pyramid computer are finite-state  相似文献   

3.
Semigroup and prefix computations on two-dimensional mesh-connected computers with multiple broadcasting (2-MCCMBs) are studied. Previously, only square 2-MCCMBs with N processing elements were considered for semigroup computations of N data items, and O(N1/6) time was required. It is found that square machines are not the best form for semigroup computations, and an O(N1/8)-time algorithm is derived on an N5/8×N3/8 rectangular 2-MCCMB. This time complexity can be further reduced to O(N1/9) if fewer processing elements are used. Parallel algorithms for prefix computations with the same time complexities are derived  相似文献   

4.
Even though exact algorithms exist for permutation routine of n2 messages on a n×n mesh of processors which require constant size queues, the constants are very large and the algorithms very complicated to implement. A novel, simple heuristic for the above problem is presented. It uses constant and very small size queues (size=2). For all the simulations run on randomly generated data, the number of routing steps that is required by the algorithm is almost equal to the maximum distance a packet has to travel. A pathological case is demonstrated where the routing takes more than the optimal, and it is proved that the upper bound on the number of required steps is O(n2). Furthermore, it is shown that the heuristic routes in optimal time inversion, transposition, and rotations, three special routing problems that appear very often in the design of parallel algorithms  相似文献   

5.
Parallel algorithms for several important combinatorial problems such as the all nearest smaller values problem, triangulating a monotone polygon, and line packing are presented. These algorithms achieve linear speedups on the pipelined hypercube, and provably optimal speedups on the shuffle-exchange and the cube-connected-cycles for any number p of processors satisfying 1⩽pn/((log3n)(loglog n)2), where n is the input size. The lower bound results are established under no restriction on how the input is mapped into the local memories of the different processors  相似文献   

6.
Optimal broadcasting on the star graph   总被引:2,自引:0,他引:2  
The star graph has been show to be an attractive alternative to the widely used n-cube. Like the n-cube, the star graph possesses rich structure and symmetry as well as fault tolerant capabilities, but has a smaller diameter and degree. However, very few algorithms exists to show its potential as a multiprocessor interconnection network. Many fast and efficient parallel algorithms require broadcasting as a basic step. An optimal algorithm for one-to-all broadcasting in the star graph is proposed. The algorithm can broadcast a message to N processors in O(log2 N) time. The algorithm exploits the rich structure of the star graph and works by recursively partitioning the original star graph into smaller star graphs. In addition, an optimal all-to-all broadcasting algorithm is developed  相似文献   

7.
A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n=Ω(p 1+ε) for any constant ε>0, and in time O(n log n/p+log3 p) otherwise. This clearly attains a linear speedup when n=Ω(p 1+ε). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication  相似文献   

8.
A linear-time algorithm is developed to perform all odd (even) length circular shifts of data in an SIMD (single-instruction-stream, multiple-data-stream) hypercube. As an application, the algorithm is used to obtain an O(M2+log N) time and O(1) memory per processor algorithm to compute the two-dimensional convolution of an N×N image and an M×M template on an N2 processor SIMD hypercube. This improves the previous best complexity of O(M2 log M+log N)  相似文献   

9.
Using a directed acyclic graph (DAG) model of algorithms, the authors focus on processor-time-minimal multiprocessor schedules: time-minimal multiprocessor schedules that use as few processors as possible. The Kung, Lo, and Lewis (KLL) algorithm for computing the transitive closure of a relation over a set of n elements requires at least 5n-4 parallel steps. As originally reported. their systolic array comprises n2 processing elements. It is shown that any time-minimal multiprocessor schedule of the KLL algorithm's dag needs at least n2/3 processing elements. Then a processor-time-minimal systolic array realizing the KLL dag is constructed. Its processing elements are organized as a cylindrically connected 2-D mesh, when n=0 mod 3. When n≠0 mod 3, the 2-D mesh is connected as a torus  相似文献   

10.
A novel discrete relaxation architecture   总被引:1,自引:0,他引:1  
The discrete relaxation algorithm (DRA) is a computational technique that enforces arc consistency (AC) in a constraint satisfaction problem (CSP). The original sequential AC-1 algorithm suffers from O(n3m3) time complexity, and even the optimal sequential AC-4 algorithm is O (n2m2) for an n-object and m-label DRA problem. Sample problem runs show that these algorithms are all too slow to meet the need for any useful, real-time CSP applications. A parallel DRA5 algorithm that reaches a lower bound of O(nm) (where the number of processors is polynomial in the problem size) is given. A fine-grained, massively parallel hardware computer architecture has been designed for the DRA5 algorithm. For practical problems, many orders of magnitude of efficiency improvement can be reached on such a hardware architecture  相似文献   

11.
A closed-form expression has been reported in the literature for LN, the number of digital line segments of length N that correspond to lines of the form y=ax+β, O⩽α, β<1. The authors prove an asymptotic estimate for LN that might prove useful for many applications, namely, LN=N 32+O(N2 log N). An application to an image registration problem is given  相似文献   

12.
In the above-titled paper (ibid., vol.12, no.11, p.1088-92, Nov. 1990), parallel implementations of hierarchical clustering algorithms that achieve O(n2) computational time complexity and thereby improve on the baseline of sequential implementations are described. The latter are stated to be O( n3), with the exception of the single-link method. The commenter points out that state-of-the-art hierarchical clustering algorithms have O(n2) time complexity and should be referred to in preference to the O(n3 ) algorithms, which were described in many texts in the 1970s. Some further references in the parallelizing of hierarchic clustering algorithms are provided  相似文献   

13.
In a general algebraic framework, starting with a bicoprime factorization P=NprD-1 Npl, a right-coprime factorization Np Dp-1, a left-coprime factorization D-1pNp, and the generalized Bezout identities associated with the pairs (Np, Dp) and (D˜ p, N˜p) are obtained. The set of all H-stabilizing compensators for P in the unity-feedback configuration S(P, C) are expressed in terms of (Npr, D, N pt) and the elements of the Bezout identity. The state-space representation P=C(sI-A)-1B is included as an example  相似文献   

14.
Memory and processing architecture for 3D voxel-based imagery   总被引:1,自引:0,他引:1  
A versatile voxel-based architecture for 3-D volume visualization, called the Cube architecture, is introduced. A small-scale prototype of the architecture has been realized in hardware and has been operating in true real-time, faster than the alternative voxel systems. The Cube architecture is centered around a 3-D cubic frame buffer, of voxels, and it entertains three processors that access the frame buffer to input sampled and synthetic data, to manipulate the 3-D images, and to project and render them. To cope with the huge quantity of voxels and still perform in real-time, two special features were incorporated within the architecture: a unique skewed memory organization, which permits the retrieval and storage of voxels in parallel, and a multiple-write bus, which speeds up the viewing process. These features allow Cube, for example, to project an image of n3 voxels in O(n 2 log n) time rather than the conventional O( n3) time  相似文献   

15.
Parallel algorithms on SIMD (single-instruction stream multiple-data stream) machines for hierarchical clustering and cluster validity computation are proposed. The machine model uses a parallel memory system and an alignment network to facilitate parallel access to both pattern matrix and proximity matrix. For a problem with N patterns, the number of memory accesses is reduced from O(N 3) on a sequential machine to O(N2) on an SIMD machine with N PEs  相似文献   

16.
Using a directed acyclic graph (DAG) model of algorithms, the paper focuses on time-minimal multiprocessor schedules that use as few processors as possible. Such a processor-time-minimal scheduling of an algorithm's DAG first is illustrated using a triangular shaped 2-D directed mesh (representing, for example, an algorithm for solving a triangular system of linear equations). Then, algorithms represented by an n×n×n directed mesh are investigated. This cubical directed mesh is fundamental; it represents the standard algorithm for computing matrix product as well as many other algorithms. Completion of the cubical mesh required 3n-2 steps. It is shown that the number of processing elements needed to achieve this time bound is at least [3n2/4]. A systolic array for the cubical directed mesh is then presented. It completes the mesh using the minimum number of steps and exactly [3n 2/4] processing elements it is processor-time-minimal. The systolic array's topology is that of a hexagonally shaped, cylindrically connected, 2-D directed mesh  相似文献   

17.
The performance evaluation of processor-memory communications for multiprocessor systems using circuit switched interconnection networks with a hold strategy is performed. Message size and processor processing time are considered and shown to have a significant effect on the overall system performance. A closed queuing network model is proposed such that only (n+2) states are required by the proposed model, in contrast to (n2+3n+4)/2 states needed in previous studies, where n is the number of stages of the multistage interconnection network. Since a closed-form solution is obtained, the behavior of a complete cycle of memory access through multistage interconnection networks can be accurately analyzed and various performance bounds can be obtained  相似文献   

18.
An adaptive parallel algorithm for inducing a priority queue structure on an n-element array is presented. The algorithm is extended to provide optimal parallel construction algorithms for three other heap-like structures useful in implementing double-ended priority queues, namely min-max heaps, deeps, and min-max-pair heaps. It is shown that an n-element array can be made into a heap, a deap, a min-max heap, or a min-max-pair heap in O(log n+(n /p)) time using no more than n/log n processors, in the exclusive-read-exclusive-write parallel random-access machine model  相似文献   

19.
A parallel sorting algorithm for sorting n elements evenly distributed over 2d p nodes of a d-dimensional hypercube is presented. The average running time of the algorithm is O((n log n)/p+p log 2n). The algorithm maintains a perfect load balance in the nodes by determining the (kn/p)th elements (k1,. . ., (p-1)) of the final sorted list in advance. These p-1 keys are used to partition the sorted sublists in each node to redistribute data to the nodes to be merged in parallel. The nodes finish the sort with an equal number of elements (n/ p) regardless of the data distribution. A parallel selection algorithm for determining the balanced partition keys in O(p log2n) time is presented. The speed of the sorting algorithm is further enhanced by the distance-d communication capability of the iPSC/2 hypercube computer and a novel conflict-free routing algorithm. Experimental results on a 16-node hypercube computer show that the sorting algorithm is competitive with the previous algorithms and faster for skewed data distributions  相似文献   

20.
Computing the width of a set   总被引:1,自引:0,他引:1  
For a set of points P in three-dimensional space, the width of P, W (P), is defined as the minimum distance between parallel planes of support of P. It is shown that W(P) can be computed in O(n log n +I) time and O(n) space, where I is the number of antipodal pairs of edges of the convex hull of P, and n is the number of vertices; in the worst case, I=O( n2). For a convex polyhedra the time complexity becomes O(n+I). If P is a set of points in the plane, the complexity can be reduced to O(nlog n). For simple polygons, linear time suffices  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号