Similar literature
Found 20 similar documents (search time: 31 ms)
1.
Nested dissection is a very popular direct method for solving sparse linear systems that arise from finite difference and finite element methods. Worley and Schreiber [16] give a fine grain algorithm for a square array of processors. Their algorithm uses $O(N^2)$ processors, each with $O(N)$ memory, to factor an $N^2 \times N^2$ sparse matrix whose graph is an $N \times N$ mesh. The efficiency of their method is between 1/46 and 1/12. George et al. [6] [8] give a medium grain algorithm for hypercube architectures, while George et al. [7] give an algorithm for shared memory machines. These papers present a column oriented approach which can exploit $O(N)$ parallelism and yield efficiencies up to 50%. Lucas [11] also gives a column oriented scheme which achieves up to 75% efficiency and $O(N)$ parallelism. In this paper, we present a medium to fine grain algorithm for a $P \times P$ array of processors with local memory. This algorithm can exploit up to $O(N^2)$ parallelism. The efficiency of the fine grain version is comparable to [16], while the medium grain version achieves about 49% efficiency. The strength of the method is due to three factors: its ability to pipeline much of the computation, its overlapping of computation and communication, and its use of level 3 BLAS-like primitives. In addition to its high efficiency, its memory requirement is optimal: only $O(N^2 \log N / P^2)$ words of memory are needed per processor.

2.
A heap structure designed for secondary storage is suggested that tries to make the best use of the available buffer space in primary memory. The heap is a complete multi-way tree, with multi-page blocks of records as nodes, satisfying a generalized heap property. A special feature of the tree is that the nodes may be partially filled, as in B-trees. The structure is complemented with the priority-queue operations insert and delete-max. When handling a sequence of S operations, the number of page transfers performed is shown to be $O\left(\sum_{i=1}^{S} (1/P) \log_{M/P}(N_i/P)\right)$, where P denotes the number of records fitting into a page, M the capacity of the buffer space in records, and $N_i$ the number of records in the heap prior to the ith operation (assuming $P \ge 1$ and $S > M \ge c \cdot P$, where c is a small positive constant). The number of comparisons required when handling the sequence is $O\left(\sum_{i=1}^{S} \log_2 N_i\right)$. Using the suggested data structure we obtain an optimal external heapsort that performs $O((N/P) \log_{M/P}(N/P))$ page transfers and $O(N \log_2 N)$ comparisons in the worst case when sorting N records.

3.
The paper presents parallel algorithms for solving the Poisson equation at $N^2$ mesh points. The methods, based on marching techniques, are structured for efficient parallel realization. Using orthogonal decomposition properties of the arising matrices, the algorithms can be formulated in terms of transformed vectors. On an MIMD computer with no more than N processors, the computations can be performed in horizontal slices with minimal synchronization requirements. On an SIMD machine with $N^2$ processors, the complexity bound $O(\log N)$ is achieved, whereby a single marching requires only $10 \log N$ steps.

4.
This paper describes several parallel algorithms for image edge relaxation on array processors with different numbers of processing elements (PEs) connected by a mesh or hypercube network. The time complexity of Prager's original edge relaxation scheme is $O(N^2)$ per iteration using floating-point operations on a sequential machine, where $N^2$ is the number of pixels in the image. Modifications to the scheme are made so that no multiplications are employed and only integer operations are required. Moreover, with parallel processing, the time complexity per iteration is reduced to a constant. A time complexity analysis of two parallel algorithms is performed. Although the algorithm on an array processor with $4N^2$ PEs achieves a higher degree of parallelism, the algorithm with $N^2$ PEs is preferred. Further modifications to the latter algorithm are made to accommodate fewer PEs.

5.
Parallel clustering algorithms (total citations: 3; self-citations: 0; citations by others: 3)
Clustering techniques play an important role in exploratory pattern analysis, unsupervised learning and image segmentation applications. Many clustering algorithms, both partitional and hierarchical, require intensive computation, even for a modest number of patterns. This paper presents two parallel clustering algorithms. For a clustering problem with $N = 2^n$ patterns and $M = 2^m$ features, the time complexity of the traditional partitional clustering algorithm on a single processor computer is $O(MNK)$, where K is the number of clusters. The proposed algorithm on an SIMD computer with MN processors has a time complexity of $O(K(n + m))$. The time complexity of the proposed single-link hierarchical clustering algorithm is reduced from the $O(MN^2)$ of the uniprocessor algorithm to $O(nN)$ with MN processors.

6.
Stéphane, Pattern Recognition, 1995, 28(12): 1993-2000
We propose a parallel thinning algorithm for binary pictures. Given an $N \times N$ binary image including an object, our algorithm computes the skeleton of the object in $O(N^2)$ time, using a pyramidal decomposition of the picture. The behavior of this algorithm is studied on a family of digitizations of the same object at different levels of resolution. On the Exclusive Read Exclusive Write (EREW) Parallel Random Access Machine (PRAM), our algorithm runs in $O(\log N)$ time using $O(N^2/\log N)$ processors and is work-optimal. The same result is obtained with high-connectivity distributed memory SIMD machines having strong hypercube and pyramid. We describe the basic operator, the pyramidal algorithm and some experimental results on the SIMD MasPar parallel machine.

7.
A new family of network topologies containing multiple loops is discussed in this paper. In the proposed structure, N processors are interconnected to form a graph $G(m, N)$, $m \ge 3$, where m is a parameter of the graph such that N is an even multiple of m and $(m-1) \times 2^{[(m-1)/2]+1} < N \le m \times 2^{[m/2]+1}$. The graph $G(m, N)$ is Hamiltonian, with an average node degree of $3 + 1/m$ when m is even and exactly 3 when m is odd; the maximum node degree is 4. The diameter of $G(m, N)$ is upper bounded by $[11m/8] + 1$. A point-to-point routing algorithm is presented. Implementation of ASCEND/DESCEND algorithms in $O(m)$ time is discussed. It is shown that in the case of a single node failure, the diameter increases by at most 6.

8.
This paper describes some new techniques for the rapid evaluation and fitting of radial basis functions. The techniques are based on the hierarchical and multipole expansions recently introduced by several authors for the calculation of many-body potentials. Consider in particular the N-term thin-plate spline, $s(x) = \sum_{j=1}^{N} d_j \phi(x - x_j)$, where $\phi(u) = |u|^2 \log|u|$, in 2 dimensions. The direct evaluation of s at a single extra point requires an extra $O(N)$ operations. This paper shows that, with judicious use of series expansions, the incremental cost of evaluating $s(x)$ to within precision $\epsilon$ can be cut to $O(1 + |\log \epsilon|)$ operations. In particular, if A is the interpolation matrix, $a_{i,j} = \phi(x_i - x_j)$, the technique allows computation of the matrix-vector product $Ad$ in $O(N)$ operations, rather than the previously required $O(N^2)$, and using only $O(N)$ storage. Fast, storage-efficient computation of this matrix-vector product makes pre-conditioned conjugate-gradient methods very attractive as solvers of the interpolation equations, $Ad = y$, when N is large.
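For reference, the direct evaluation that the abstract takes as its baseline can be sketched as follows. This is only the naive $O(N)$ per-point sum and the resulting $O(N^2)$ matrix-vector product, not the fast hierarchical method of the paper; the function names `phi`, `tps_eval` and `tps_matvec` are hypothetical, and $\phi(0)$ is taken to be 0 (the limit of $r^2 \log r$ as $r \to 0$).

```python
import math

def phi(r):
    """Thin-plate spline basic function phi(u) = |u|^2 log|u|, with phi(0) = 0 (assumed limit)."""
    return 0.0 if r == 0.0 else r * r * math.log(r)

def tps_eval(x, centres, d):
    """Direct O(N) evaluation of s(x) = sum_j d_j * phi(|x - x_j|) for 2-D points."""
    return sum(dj * phi(math.dist(x, xj)) for xj, dj in zip(centres, d))

def tps_matvec(centres, d):
    """Direct O(N^2) product A d, where a_ij = phi(|x_i - x_j|)."""
    return [tps_eval(xi, centres, d) for xi in centres]

# Illustrative data (not from the paper): two centres and their coefficients.
centres = [(0.0, 0.0), (2.0, 0.0)]
d = [1.0, 0.5]
product = tps_matvec(centres, d)
```

Each call to `tps_eval` touches all N centres, which is the per-point cost the abstract proposes to reduce to $O(1 + |\log \epsilon|)$.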

9.
For each nonempty binary word $w = c_1 c_2 \cdots c_q$, where $c_i \in \{0,1\}$, the nonnegative integer $\sum_{i=1}^{q} (q+1-i) c_i$ is called the moment of w and is denoted by $M(w)$. Let $[w]$ denote the conjugacy class of w. Define $M([w]) = \{M(u) : u \in [w]\}$, $N(w) = \{M(u) - M(w) : u \in [w]\}$ and $\delta(w) = \max\{M(u) - M(v) : u, v \in [w]\}$. Using these objects, we obtain equivalent conditions for a binary word to be an $\alpha$-word (respectively, a power of an $\alpha$-word). For instance, we prove that the following statements are equivalent for any binary word w with $|w| \ge 2$: (a) w is an $\alpha$-word, (b) $\delta(w) = |w| - 1$, (c) w is a cyclic balanced primitive word, (d) $M([w])$ is a set of $|w|$ consecutive positive integers, (e) $N(w)$ is a set of $|w|$ consecutive integers and $0 \in N(w)$, (f) w is primitive and $[w] \subset St$.
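The definitions of the moment and of the spread of moments over a conjugacy class can be checked numerically with a short sketch. The helper names `moment`, `conjugates` and `delta` are hypothetical, and the two sample words are illustrative choices rather than examples taken from the paper.

```python
def moment(w):
    """M(w) = sum over i = 1..q of (q + 1 - i) * c_i for a binary string w = c_1 ... c_q."""
    q = len(w)
    # enumerate() is 0-based, so position i + 1 carries weight q - i.
    return sum((q - i) * int(c) for i, c in enumerate(w))

def conjugates(w):
    """Conjugacy class [w]: the set of all cyclic rotations of w."""
    return {w[k:] + w[:k] for k in range(len(w))}

def delta(w):
    """delta(w) = max{M(u) - M(v) : u, v in [w]}."""
    moments = [moment(u) for u in conjugates(w)]
    return max(moments) - min(moments)

# "00101" has class moments 4, 5, 6, 7, 8, so delta equals |w| - 1 = 4.
balanced_moments = sorted(moment(u) for u in conjugates("00101"))
# "0011" has class moments 3, 5, 7, so delta is 4, which differs from |w| - 1 = 3.
unbalanced_moments = sorted({moment(u) for u in conjugates("0011")})
```

For the first word the class moments are consecutive and the difference between largest and smallest is $|w| - 1$, matching conditions (b) and (d); the second word fails both.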

10.
This paper presents an efficient algorithm for enumerating all minimal a-b separators separating given non-adjacent vertices a and b in an undirected connected simple graph $G = (V, E)$. Our algorithm requires $O(n^3 R_{ab})$ time, which improves the known result of $O(n^4 R_{ab})$ time for solving this problem, where $|V| = n$ and $R_{ab}$ is the number of minimal a-b separators. The algorithm can be generalized for enumerating all minimal A-B separators that separate non-adjacent vertex sets $A, B \subset V$, and it requires $O(n^2 (n - n_A - n_B) R_{AB})$ time in this case, where $n_A = |A|$, $n_B = |B|$ and $R_{AB}$ is the number of all minimal A-B separators. Using the algorithm above as a routine, an efficient algorithm for enumerating all minimal separators of G separating G into at least two connected components is constructed. The algorithm runs in time $O(n^3 R^+_\Sigma + n^4 R_\Sigma)$, which improves the known result of $O(n^6 R_\Sigma)$ time, where $R_\Sigma$ is the number of all minimal separators of G and $R_\Sigma \le R^+_\Sigma = \sum_{1 \le i < j \le n,\ (v_i, v_j) \notin E} R_{v_i v_j} \le (n(n-1)/2 - m) R_\Sigma$. Efficient parallelization of these algorithms is also discussed. It is shown that the first algorithm requires at most $O((n/\log n) R_{ab})$ time and the second one runs in time $O((n/\log n) R^+_\Sigma + n \log n \, R_\Sigma)$ on a CREW PRAM with $O(n^3)$ processors.

11.
We substantially improve the known algorithms for approximating all the complex zeros of an nth degree polynomial $p(x)$. Our new algorithms save both Boolean and arithmetic sequential time, versus the previous best algorithms of Schönhage [1], Pan [2], and Neff and Reif [3]. In parallel (NC) implementation, we dramatically decrease the number of processors, versus the parallel algorithm of Neff [4], which was the only NC algorithm known for this problem so far. Specifically, under the simple normalization assumption that the variable x has been scaled so as to confine the zeros of $p(x)$ to the unit disc $\{x : |x| \le 1\}$, our algorithms (which promise to be practically effective) approximate all the zeros of $p(x)$ within the absolute error bound $2^{-b}$, by using order of n arithmetic operations and order of $(b + n)n^2$ Boolean (bitwise) operations (in both cases up to within polylogarithmic factors). The algorithms allow their optimal (work preserving) NC parallelization, so that they can be implemented by using polylogarithmic time and the orders of n arithmetic processors or $(b + n)n^2$ Boolean processors. All the cited bounds on the computational complexity are within polylogarithmic factors from the optimum (in terms of n and b) under both arithmetic and Boolean models of computation (in the Boolean case, under the additional (realistic) assumption that $n = O(b)$).

12.
In this paper new methods of discretization (integer approximation) of algebraic spatial curves in the form of intersecting surfaces P(x, y, z) = 0 and Q(x, y, z) = 0 are analyzed.

The use of homogeneous cubical grids G(h3) to discretize a curve is the essence of the method. Two new algorithms of discretization (on 6-connected grid G6c(h3) and 26-connected grid G26(h3)) are presented based on the method above. Implementation of the algorithms for algebraic spatial curves is suggested. The elaborated algorithms are adjusted for application in computer graphics and numerical control of machine tools.


13.
For an ordered set $W = \{w_1, w_2, \ldots, w_k\}$ of vertices and a vertex v in a connected graph G, the (metric) representation of v with respect to W is the k-vector $r(v \mid W) = (d(v, w_1), d(v, w_2), \ldots, d(v, w_k))$, where $d(x, y)$ represents the distance between the vertices x and y. The set W is a resolving set for G if distinct vertices of G have distinct representations. A new sharp lower bound for the dimension of a graph G in terms of its maximum degree is presented.

A resolving set of minimum cardinality is a basis for G and the number of vertices in a basis is its (metric) dimension $\dim(G)$. A resolving set S of G is a minimal resolving set if no proper subset of S is a resolving set. The maximum cardinality of a minimal resolving set is the upper dimension $\dim^+(G)$. The resolving number $res(G)$ of a connected graph G is the minimum k such that every k-set W of vertices of G is also a resolving set of G. Then $1 \le \dim(G) \le \dim^+(G) \le res(G) \le n - 1$ for every nontrivial connected graph G of order n. It is shown that $\dim^+(G) = res(G) = n - 1$ if and only if $G = K_n$, while $\dim^+(G) = res(G) = 2$ if and only if G is a path of order at least 4 or an odd cycle.

The resolving numbers and upper dimensions of some well-known graphs are determined. It is shown that for every pair a, b of integers with $2 \le a \le b$, there exists a connected graph G with $\dim(G) = \dim^+(G) = a$ and $res(G) = b$. Also, for every positive integer N, there exists a connected graph G with $res(G) - \dim^+(G) \ge N$ and $\dim^+(G) - \dim(G) \ge N$.
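The definitions of a metric representation, a resolving set and the metric dimension can be illustrated with a brute-force sketch. It assumes a connected, unweighted graph given as an adjacency dictionary; the names `distances_from`, `is_resolving` and `metric_dimension` are hypothetical, and the exhaustive search is exponential in the worst case, so it is only suitable for tiny examples.

```python
from collections import deque
from itertools import combinations

def distances_from(graph, source):
    """Breadth-first search distances d(source, v) in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_resolving(graph, W):
    """True if distinct vertices have distinct representations r(v | W)."""
    dist = {w: distances_from(graph, w) for w in W}
    representations = {tuple(dist[w][v] for w in W) for v in graph}
    return len(representations) == len(graph)

def metric_dimension(graph):
    """Smallest size of a resolving set, found by exhaustive search."""
    vertices = list(graph)
    for k in range(1, len(vertices)):
        if any(is_resolving(graph, W) for W in combinations(vertices, k)):
            return k
    raise ValueError("expected a connected graph with at least two vertices")

# Illustrative graphs: the path on 4 vertices and the complete graph K_4.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
```

On these inputs an end vertex alone resolves the path, while the complete graph on four vertices needs three vertices, which agrees with the upper bound $n - 1$ quoted above.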


14.
We show that given any family of asymptotically stabilizable LTI systems depending continuously on a parameter that lies in some subset $[a_1, b_1] \times \cdots \times [a_p, b_p]$ of $\mathbb{R}^p$, there exists a $C^0$ time-varying state feedback law $v(t,x)$ (resp. a $C^0$ time-invariant feedback law $v(x)$) which robustly globally exponentially stabilizes (resp. which robustly stabilizes, not asymptotically) the family. Further, if these systems are obtained by linearizing some nonlinear systems, then $v(t,x)$ locally exponentially stabilizes these nonlinear systems. Finally, $v(t,x)$ globally exponentially stabilizes any time-varying system which switches "slowly enough" between the given LTI systems.

15.
In this paper, a distributed selectsort algorithm and a parameterized selectsort algorithm are presented to be applied on distributed systems for cases when N P where N is the number of elements to be sorted and P is the number of processors in the system. The distributed system considered in this paper uses a broadcasting channel for communication between processors. We show that the number of messages required for the parameterized selectsort algorithm is independent of N and is of complexity $O(P)$, which is optimal in a distributed system with P processors. Furthermore, the amount of communication required in terms of elements is $N + O(P^3)$ and the computation time complexity is $O((N/P) \lg N + P^2 \lg(N/P))$. Hence, when N $P^3$, the computation time complexity is $O((N/P) \lg N)$, which is optimal using P processors. In addition, this parameterized algorithm provides a parameter K whose value lets us trade off among processing, memory, and communication requirements. It is shown that this parameterized algorithm can reduce the communication requirements significantly while only slightly increasing the computation requirements.

16.
The problem of finding a rectilinear minimum bend path (RMBP) between two designated points inside a rectilinear polygon has applications in robotics and motion planning. In this paper, we present efficient algorithms to solve the query version of the RMBP problem for special classes of rectilinear polygons given their visibility graphs. Specifically, given an unweighted graph $G = (V, E)$ with $|V| = N$ and $|E| = M$, we present algorithms that preprocess G in linear space and time so that shortest distance queries (queries asking for the distance between any pair of nodes in the graph) can be answered in constant time and space. For the case of a chordal graph G, our algorithms give a distance which is at most one away from the actual shortest distance. When G is a K-chordal graph, our algorithm produces an exact shortest distance in $O(K)$ time. We also present a non-trivial parallel implementation of the sequential preprocessing algorithm for the CREW-PRAM model which runs in $O(\log^2 N)$ time using $O(N + M)$ processors. After the preprocessing, we can answer the queries in constant time using a single processor.

17.
Parallel algorithms for solving the satisfaction problem of non-trivial functional and multivalued data dependencies (FDs and MVDs) in a relation of N tuples by M processors are developed in this paper. Algorithms performing, in a parallel manner, batch or interactive checking of these data dependencies are also discussed. The M processors are organized as a linear systolic array. The time complexities of the first two algorithms for solving the FD satisfaction problem under M N are both $O(N)$, and that of Algorithm (3) or (4) for solving the FD or MVD satisfaction problem under N M is $O(N^2/M)$. The latter complexity reduces to $O(N)$ if N = M and is no worse than $O(N \log N)$ if N = M (N/log N).
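To make the FD satisfaction problem concrete, here is a minimal sequential check. It is a hash-based single-pass sketch and not the systolic-array algorithms of the paper; the function name `satisfies_fd`, the representation of tuples as dictionaries, and the sample relation are all assumptions made for illustration.

```python
def satisfies_fd(rows, lhs, rhs):
    """Check whether the relation `rows` satisfies the functional dependency lhs -> rhs.

    rows: iterable of dicts mapping attribute name to value (assumed representation).
    lhs, rhs: sequences of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical relation: each department is located in exactly one city.
relation = [
    {"emp": "a", "dept": "x", "city": "Oslo"},
    {"emp": "b", "dept": "x", "city": "Oslo"},
    {"emp": "c", "dept": "y", "city": "Rome"},
]
dept_determines_city = satisfies_fd(relation, ["dept"], ["city"])
city_determines_emp = satisfies_fd(relation, ["city"], ["emp"])
```

In this sample, `dept -> city` holds, while `city -> emp` fails because two employees share a city.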

18.
In this note, we deal with the following problem: given $X \subseteq \mathbb{R}^n$, a multifunction $\Gamma : X \to 2^X$, and two (single-valued) maps $f : X \to \mathbb{R}^n$ and $\eta : X \times X \to \mathbb{R}^n$, find a point $x^* \in X$ such that $x^* \in \Gamma(x^*)$ and $\langle f(x^*), \eta(x, x^*) \rangle \ge 0$ for all $x \in \Gamma(x^*)$. We prove an existence theorem in which, in particular, the function f is not supposed to be continuous.

19.
It is pointed out in this brief paper that the $l^1$ optimization problem $\min_{Q \in l^1_{q \times p}} \| H - U * Q * V \|_1$, with $H \in l^1_{m \times n}$, $U \in l^1_{m \times q}$, $V \in l^1_{p \times n}$, can be solved in one step rather than two. The solution of the dual problem is obviated by the direct solution of the primal problem via linear programming. The method here is applicable to finite-dimensional problems or, in the general case, to approximating finite-dimensional problems.

20.
Kuo-Liang, Wan-Yu, Pattern Recognition, 2003, 36(12): 2793-2804
Thresholding is a fundamental operation in image processing. Based on the pairwise nearest neighbor technique and the variance criterion, this paper presents two fast adaptive thresholding algorithms. The first proposed algorithm takes $O((m - k) m \tau)$ time, where k denotes the number of thresholds specified by the user, m denotes the size of the compact image histogram, and the parameter $\tau$ satisfies $1 \le \tau \le m$. On a set of different real images, experimental results reveal that the first proposed algorithm is considerably faster than three previous algorithms while having a good feature-preserving capability. Those three previous algorithms need $O(m^k)$ time. Given a specific peak-signal-to-noise ratio (PSNR), we further present a second thresholding algorithm that determines as few thresholds as possible while obtaining a thresholded image that satisfies the given PSNR. The second proposed algorithm takes $O((m - k) m \tau + \gamma N)$ time, where N and $\gamma$ denote the image size and the fewest number of thresholds required, respectively. Some experiments are carried out to demonstrate that the thresholded images are encouraging. Since the time complexities of our two proposed thresholding algorithms are polynomial, they could meet the real-time demand in image preprocessing.
