首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
An O(n2) time serial algorithm is developed for obtaining the medial axis transform (MAT) of an n×n image. An O(log n) time CREW PRAM algorithm and an O(log2 n) time SIMD hypercube parallel algorithm for the MAT are also developed. Both of these use O(n2) processors. Two problems associated with the MAT, the area and perimeter reporting problem, are studied. An O(log n) time hypercube algorithm is developed for both of them, where n is the number of squares in the MAT, and the algorithms use O(n2) processors  相似文献   

2.
A novel discrete relaxation architecture   总被引:1,自引:0,他引:1  
The discrete relaxation algorithm (DRA) is a computational technique that enforces arc consistency (AC) in a constraint satisfaction problem (CSP). The original sequential AC-1 algorithm suffers from O(n3m3) time complexity, and even the optimal sequential AC-4 algorithm is O (n2m2) for an n-object and m-label DRA problem. Sample problem runs show that these algorithms are all too slow to meet the need for any useful, real-time CSP applications. A parallel DRA5 algorithm that reaches a lower bound of O(nm) (where the number of processors is polynomial in the problem size) is given. A fine-grained, massively parallel hardware computer architecture has been designed for the DRA5 algorithm. For practical problems, many orders of magnitude of efficiency improvement can be reached on such a hardware architecture  相似文献   

3.
The transitive closure problem in O(1) time is solved by a new method that is far different from the conventional solution method. On processor arrays with reconfigurable bus systems, two O (1) time algorithms are proposed for computing the transitive closure of an undirected graph. One is designed on a three-dimensional n×n×n processor array with a reconfigurable bus system, and the other is designed on a two-dimensional n2×n2 processor array with a reconfigurable bus system, where n is the number of vertices in the graph. Using the O(1) time transitive closure algorithms, many other graph problems are solved in O(1) time. These problems include recognizing bipartite graphs and finding connected components, articulation points, biconnected components, bridges, and minimum spanning trees in undirected graphs  相似文献   

4.
Using a directed acyclic graph (DAG) model of algorithms, the paper focuses on time-minimal multiprocessor schedules that use as few processors as possible. Such a processor-time-minimal scheduling of an algorithm's DAG first is illustrated using a triangular shaped 2-D directed mesh (representing, for example, an algorithm for solving a triangular system of linear equations). Then, algorithms represented by an n×n×n directed mesh are investigated. This cubical directed mesh is fundamental; it represents the standard algorithm for computing matrix product as well as many other algorithms. Completion of the cubical mesh required 3n-2 steps. It is shown that the number of processing elements needed to achieve this time bound is at least [3n2/4]. A systolic array for the cubical directed mesh is then presented. It completes the mesh using the minimum number of steps and exactly [3n 2/4] processing elements it is processor-time-minimal. The systolic array's topology is that of a hexagonally shaped, cylindrically connected, 2-D directed mesh  相似文献   

5.
A distributed knot detection algorithm for general graphs is presented. The knot detection algorithm uses at most O(n log n+m) messages and O(m+n log n) bits of memory to detect all knots' nodes in the network (where n is the number of nodes and m is the number of links). This is compared to O(n2) messages needed in the best algorithm previously published. The knot detection algorithm makes use of efficient cycle detection and clustering techniques. Various applications for the knot detection algorithms are presented. In particular, its importance to deadlock detection in store and forward communication networks and in transaction systems is demonstrated  相似文献   

6.
Even though exact algorithms exist for permutation routine of n2 messages on a n×n mesh of processors which require constant size queues, the constants are very large and the algorithms very complicated to implement. A novel, simple heuristic for the above problem is presented. It uses constant and very small size queues (size=2). For all the simulations run on randomly generated data, the number of routing steps that is required by the algorithm is almost equal to the maximum distance a packet has to travel. A pathological case is demonstrated where the routing takes more than the optimal, and it is proved that the upper bound on the number of required steps is O(n2). Furthermore, it is shown that the heuristic routes in optimal time inversion, transposition, and rotations, three special routing problems that appear very often in the design of parallel algorithms  相似文献   

7.
A new parallel algorithm is proposed for fat image labeling using local operators on image pixels. The algorithm can be implemented on an n×n mesh-connected computer such that, for any integer k in the range [1, log (2n)], the algorithm requires Θ(kn1k/) bits of local memory per processor and takes Θ(kn) time. Bit-serial processors and communication links can be used without affecting the asymptotic time complexity of the algorithm. The time complexity of the algorithm has very small leading constant factors, which makes it superior to previous mesh computer labeling algorithms for most practical image sizes (e.g. up to 4096×4096 images). Furthermore, the algorithm is based on using stacks that can be realized using very fast shift registers within each processing element  相似文献   

8.
Considers the problem of determining whether each point in a polytope n×n matrices is stable. The approach is to check stability of certain faces of the polytope. For n⩾3, the authors show that stability of each point in every (2n-4)-dimensional face guarantees stability of the entire polytope. Furthermore, they prove that, for any kn2, there exists a k-dimensional polytope containing a strictly unstable point and such that all its subpolytopes of dimension min {k-1,2n-5} are stable  相似文献   

9.
In the above-titled paper (ibid., vol.12, no.11, p.1088-92, Nov. 1990), parallel implementations of hierarchical clustering algorithms that achieve O(n2) computational time complexity and thereby improve on the baseline of sequential implementations are described. The latter are stated to be O( n3), with the exception of the single-link method. The commenter points out that state-of-the-art hierarchical clustering algorithms have O(n2) time complexity and should be referred to in preference to the O(n3 ) algorithms, which were described in many texts in the 1970s. Some further references in the parallelizing of hierarchic clustering algorithms are provided  相似文献   

10.
An adaptive parallel algorithm for inducing a priority queue structure on an n-element array is presented. The algorithm is extended to provide optimal parallel construction algorithms for three other heap-like structures useful in implementing double-ended priority queues, namely min-max heaps, deeps, and min-max-pair heaps. It is shown that an n-element array can be made into a heap, a deap, a min-max heap, or a min-max-pair heap in O(log n+(n /p)) time using no more than n/log n processors, in the exclusive-read-exclusive-write parallel random-access machine model  相似文献   

11.
An efficient parallel algorithm is presented for convolution on a mesh-connected computer with wraparound. The algorithm does not require a broadcast feature for data values, as assumed by previously proposed algorithms. As a result, the algorithm is applicable to both SIMD and MIMD meshes. For an N×N image and a M×M template, the previous algorithms take O (M2q) time on an N×N mesh-connected multicomputer (q is the number of bits in each entry of the convolution matrix). The algorithms have complexity O(M2r), where r=max {number of bits in an image entry, number of bits in a template entry}. In addition to not requiring a broadcast capability, these algorithms are faster for binary images  相似文献   

12.
The number of distinct entries among the m2n entries of the nth Kronecker power of an m×m matrix is derived. An algorithm to find the value of each entry of the Kronecker power is presented  相似文献   

13.
Memory and processing architecture for 3D voxel-based imagery   总被引:1,自引:0,他引:1  
A versatile voxel-based architecture for 3-D volume visualization, called the Cube architecture, is introduced. A small-scale prototype of the architecture has been realized in hardware and has been operating in true real-time, faster than the alternative voxel systems. The Cube architecture is centered around a 3-D cubic frame buffer, of voxels, and it entertains three processors that access the frame buffer to input sampled and synthetic data, to manipulate the 3-D images, and to project and render them. To cope with the huge quantity of voxels and still perform in real-time, two special features were incorporated within the architecture: a unique skewed memory organization, which permits the retrieval and storage of voxels in parallel, and a multiple-write bus, which speeds up the viewing process. These features allow Cube, for example, to project an image of n3 voxels in O(n 2 log n) time rather than the conventional O( n3) time  相似文献   

14.
A linear-time algorithm is developed to perform all odd (even) length circular shifts of data in an SIMD (single-instruction-stream, multiple-data-stream) hypercube. As an application, the algorithm is used to obtain an O(M2+log N) time and O(1) memory per processor algorithm to compute the two-dimensional convolution of an N×N image and an M×M template on an N2 processor SIMD hypercube. This improves the previous best complexity of O(M2 log M+log N)  相似文献   

15.
Semigroup and prefix computations on two-dimensional mesh-connected computers with multiple broadcasting (2-MCCMBs) are studied. Previously, only square 2-MCCMBs with N processing elements were considered for semigroup computations of N data items, and O(N1/6) time was required. It is found that square machines are not the best form for semigroup computations, and an O(N1/8)-time algorithm is derived on an N5/8×N3/8 rectangular 2-MCCMB. This time complexity can be further reduced to O(N1/9) if fewer processing elements are used. Parallel algorithms for prefix computations with the same time complexities are derived  相似文献   

16.
Using a directed acyclic graph (DAG) model of algorithms, the authors focus on processor-time-minimal multiprocessor schedules: time-minimal multiprocessor schedules that use as few processors as possible. The Kung, Lo, and Lewis (KLL) algorithm for computing the transitive closure of a relation over a set of n elements requires at least 5n-4 parallel steps. As originally reported. their systolic array comprises n2 processing elements. It is shown that any time-minimal multiprocessor schedule of the KLL algorithm's dag needs at least n2/3 processing elements. Then a processor-time-minimal systolic array realizing the KLL dag is constructed. Its processing elements are organized as a cylindrically connected 2-D mesh, when n=0 mod 3. When n≠0 mod 3, the 2-D mesh is connected as a torus  相似文献   

17.
Most existing methods of mapping algorithms into processor arrays are restricted to the case where n-dimensional algorithms, or algorithms with n nested loops, are mapped into (n-1)-dimensional arrays. However, in practice, it is interesting to map n-dimensional algorithms into (k-1)-dimensional arrays where k<n. A computational conflict occurs if two or more computations of an algorithm are mapped into the same execution time. Based on the Hermite normal form of the mapping matrix, necessary and sufficient conditions are derived to identify mapping without computational conflicts. These conditions are used to find time mappings of n-dimensional algorithms into (k-1)-dimensional arrays, k<n , without computational conflicts. For some applications, the mapping is time-optimal  相似文献   

18.
The problem of distributed leader election in an asynchronous complete network, in the presence of faults that occurred prior to the execution of the election algorithm, is discussed. Failures of this type are encountered, for example, during a recovery from a crash in the network. For a network with n processors, k of which start the algorithm that uses at most O(n log k +n+kt) messages is presented and shown to be optimal. An optimal algorithm for the case where the identities of the neighbors are known is also presented. It is noted that the order of the message complexity of a t-resilient algorithm is not always higher than that of a nonresilient one. The t-resilient algorithm is a systematic modification of an existing algorithm for a fault-free network  相似文献   

19.
A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n=Ω(p 1+ε) for any constant ε>0, and in time O(n log n/p+log3 p) otherwise. This clearly attains a linear speedup when n=Ω(p 1+ε). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication  相似文献   

20.
A linear programming (LP) approach is proposed for the weighted graph matching problem. A linear program is obtained by formulating the graph matching problem in L1 norm and then transforming the resulting quadratic optimization problem to a linear one. The linear program is solved using a simplex-based algorithm. Then, approximate 0-1 integer solutions are obtained by applying the Hungarian method on the real solutions of the linear program. The complexity of the proposed algorithm is polynomial time, and it is O(n 6L) for matching graphs of size n. The developed algorithm is compared to two other algorithms. One is based on an eigendecomposition approach and the other on a symmetric polynomial transform. Experimental results showed that the LP approach is superior in matching graphs than both other methods  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号