期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

John M. Conroy 《Parallel Computing》1990,16(2-3):139-156

Nested dissection is a very popular direct method for solving sparse linear systems that arise from finite difference and finite element methods. Worley and Schreiber [16] give a fine grain algorithm for a square array of processors. Their algorithm uses O(N²) processors, each with O(N) memory, to factor an N² by N² sparse matrix whose graphs is an N × N mesh. The efficiency of their method is between 1/46 and 1/12. George et al. [6] [8] give a medium grain algorithm for hypercube architecture, while George et al. [7] give an algorithm for shared memory machines. These papers present a column oriented approach which can exploit O(N) parallelism and yield efficiencies up to 50%. Lucas [11] also gives a column oriented scheme which achieves up to 75% efficiency and O(N) parallelism. In this paper, we present a medium to fine grain algorithm for a P × P array of processors with local memory. This algorithm can exploit up to O(N²) parallelism. The efficiency of the fine grain version is comparable to [16] while as a medium grain algorithm achieves about 49% efficiency. The strength of the method is due to three factors: its ability to pipeline much of the computation, overlapping computation and communication, and the use of level 3 BLAS like primitives. In addition to its high efficiency its memory requirement is optimal, only O(N² log N/P²) words memory is needed per processor. 相似文献

2.

Heaps and heapsort on secondary storage

R. Fadel K. V. Jakobsen J. Katajainen J. Teuhola 《Theoretical computer science》1999,220(2):585-362

A heap structure designed for secondary storage is suggested that tries to make the best use of the available buffer space in primary memory. The heap is a complete multi-way tree, with multi-page blocks of records as nodes, satisfying a generalized heap property. A special feature of the tree is that the nodes may be partially filled, as in B-trees. The structure is complemented with priority-queue operations insert and delete-max. When handling a sequence of S operations, the number of page transfers performed is shown to be O(∑_{i = 1}^S(1/P) log_(M/P)(N_i/P)), where P denotes the number of records fitting into a page, M the capacity of the buffer space in records, and N_i, the number of records in the heap prior to the ith operation (assuming P 1 and S> M c · P, where c is a small positive constant). The number of comparisons required when handling the sequence is O(∑_{i = 1}^S log₂ N_i). Using the suggested data structure we obtain an optimal external heapsort that performs O((N/P) log_(M/P)(N/P)) page transfers and O(N log₂ N) comparisons in the worst case when sorting N records. 相似文献

3.

Parallel marching Poisson solvers

Marian Vajter&#x;ic 《Parallel Computing》1984,1(3-4):325-330

The paper presents parallel algorithms for solving Poisson equation at N² mesh points. The methods based on marching techniques are structured for efficient parallel realization. Using orthogonal decomposition properties of arising matrices, the algorithms can be formulated in terms of transformed vectors. On a MIMD computer with not more than N processors, the computations can be performed in horizontal slices with minimal synchronization requirements. Considering an SIMD machine with N² processors, the complexity bound O(log N) has been achieved, whereby the single marching requires 10 log N steps only. 相似文献

4.

Parallel processing approaches to edge relaxation

Eva Leung Xiaobo Li 《Pattern recognition》1988,21(6):547-558

This paper describes several parallel algorithms for image edge relaxation on array processors with different numbers of processing elements (PEs) connected by a mesh or hypercube network. The time complexity of Prager's original edge relaxation scheme is O(N²) per iteration using floating-point operations on a sequential machine, where N² is the number of pixels in the image. Modifications to the scheme are made so that no multiplications are employed and only integer operations are required. Moreover, with parallel processing, the time complexity per iteration is reduced to some constant value. A time complexity analysis on two parallel algorithms is performed. Although the algorithm on an array processor with 4N² PEs achieved higher degree of parallelism, the algorithm with N² PEs is preferred. Further modifications on the latter algorithm are made to accommodate to fewer PEs. 相似文献

5.

Parallel clustering algorithms 总被引：3，自引：0，他引：3

Xiaobo Li Zhixi Fang 《Parallel Computing》1989,11(3):275-290

Clustering techniques play an important role in exploratory pattern analysis, unsupervised learning and image segmentation applications. Many clustering algorithms, both partitional clustering and hierarchical clustering, require intensive computation, even for a modest number of patterns. This paper presents two parallel clustering algorithms. For a clustering problem with N = 2ⁿ patterns and M = 2^m features, the time complexity of the traditional partitional clustering algorithm on a single processor computer is O(MNK), where K is the number of clusters. The proposed algorithm on anSIMD computer with MN processors has a time complexity O(K(n + m)). The time complexity of the proposed single-link hierarchical clustering algorithm is reduced from O(MN²) of the uniprocessor algorithm to O(nN) with MN processors. 相似文献

6.

Pyramidal thinning algorithm for SIMD parallel machines

Stphane 《Pattern recognition》1995,28(12):1993-2000

We propose a parallel thinning algorithm for binary pictures. Given an N × N binary image including an object, our algorithm computes in O(N²) the skeleton of the object, using a pyramidal decomposition of the picture. The behavior of this algorithm is studied considering a family of digitalization of the same object at a different level of resolution. With the Exclusive Read Exclusive Write (EREW) Parallel Random Access Machine (PRAM), our algorithm runs in O(log N) time using O(N²/logN) processors and it is work-optimal. The same result is obtained with high-connectivity distributed memory SIMD machines having strong hypercube and pyramid. We describe the basic operator, the pyramidal algorithm and some experimental results on the SIMD MasPar parallel machine. 相似文献

7.

A family of network topologies with multiple loops and logarithmic diameter

Srabani Sen Gupta Rajib K. Das Krishnendu Mukhopadhyaya Bhabani P. Sinha 《Parallel Computing》1997,22(14):2047-2064

A new family of network topologies containing multiple loops is discussed in this paper. In the proposed structure, N processors are interconnected to form a graph G(m, N), m 3, where m is a parameter of the graph such that N is an even multiple of m and (m − 1) × 2[(^{m− l)/2}]+ < N m × 2[^m/2]+1. The graph G(m, N) is hamiltonian with an average node degree (3 + l/m), when m is even and exactly 3 when m is odd. Whereas, the maximum node degree is 4. The diameter of G(m, N) is upper bounded by [11m/8]+ 1. A point to point routing algorithm has been presented. Implementation of ASCEND/DESCEND algorithms in O(m) time has been discussed. It has been shown that in case of a single node failure, the diameter increases by at most 6. 相似文献

8.

Fast evaluation of radial basis functions: I

R. K. Beatson

G. N. Newsam 《Computers & Mathematics with Applications》1992,24(12):7-19

This paper describes some new techniques for the rapid evaluation and fitting of radial basic functions. The techniques are based on the hierarchical and multipole expansions recently introduced by several authors for the calculation of many-body potentials. Consider in particular the N term thin-plate spline, s(x) = Σ_j=1^N d_jφ(x−x_j), where φ(u) = |u|²log|u|, in 2-dimensions. The direct evaluation of s at a single extra point requires an extra O(N) operations. This paper shows that, with judicious use of series expansions, the incremental cost of evaluating s(x) to within precision ε, can be cut to O(1+|log ε|) operations. In particular, if A is the interpolation matrix, a_i,j = φ(x_i−x_j, the technique allows computation of the matrix-vector product Ad in O(N), rather than the previously required O(N²) operations, and using only O(N) storage. Fast, storage-efficient, computation of this matrix-vector product makes pre-conditioned conjugate-gradient methods very attractive as solvers of the interpolation equations, Ad = y, when N is large. 相似文献

9.

Moments of conjugacy classes of binary words

Wai-Fong Chuan 《Theoretical computer science》2004,310(1-3):273-285

For each nonempty binary word w=c₁c₂c_q, where c_i{0,1}, the nonnegative integer ∑_i=1^q (q+1−i)c_i is called the moment of w and is denoted by M(w). Let [w] denote the conjugacy class of w. Define M([w])={M(u): u[w]}, N(w)={M(u)−M(w): u[w]} and δ(w)=max{M(u)−M(v): u,v[w]}. Using these objects, we obtain equivalent conditions for a binary word to be an -word (respectively, a power of an -word). For instance, we prove that the following statements are equivalent for any binary word w with |w|2: (a) w is an -word, (b) δ(w)=|w|−1, (c) w is a cyclic balanced primitive word, (d) M([w]) is a set of |w| consecutive positive integers, (e) N(w) is a set of |w| consecutive integers and 0N(w), (f) w is primitive and [w]St. 相似文献

10.

Efficient enumeration of all minimal separators in a graph

Hong Shen Weifa Liang 《Theoretical computer science》1997,180(1-2):169-180

This paper presents an efficient algorithm for enumerating all minimal a-b separators separating given non-adjacent vertices a and b in an undirected connected simple graph G = (V, E), Our algorithm requires O(n³R_ab) time, which improves the known result of O(n⁴R_ab) time for solving this problem, where ¦V¦= n and R_ab is the number of minimal a-b separators. The algorithm can be generalized for enumerating all minimal A-B separators that separate non-adjacent vertex sets A, B < V, and it requires O(n²(n − n_A − n_b)R_AB) time in this case, where n_a = ¦A¦, n_B = ¦B¦ and r_AB is the number of all minimal A−B separators. Using the algorithm above as a routine, an efficient algorithm for enumerating all minimal separators of G separating G into at least two connected components is constructed. The algorithm runs in time O(n³R⁺_Σ + n⁴R_Σ), which improves the known result of O(n⁶R_Σ) time, where R_σ is the number of all minimal separators of G and R_ΣR⁺_Σ = ∑_1i, v_j) ER_{v_iv_j} n − 1)/2 − m)R_Σ. Efficient parallelization of these algorithms is also discussed. It is shown that the first algorithm requires at most O((n/log n)R_ab) time and the second one runs in time O((n/log n)R⁺_Σ+n log nR_Σ) on a CREW PRAM with O(n³) processors. 相似文献

11.

Optimal and nearly optimal algorithms for approximating polynomial zeros

V.Y. Pan 《Computers & Mathematics with Applications》1996,31(12):97-138

We substantially improve the known algorithms for approximating all the complex zeros of an n^th degree polynomial p(x). Our new algorithms save both Boolean and arithmetic sequential time, versus the previous best algorithms of Schönhage [1], Pan [2], and Neff and Reif [3]. In parallel (NC) implementation, we dramatically decrease the number of processors, versus the parallel algorithm of Neff [4], which was the only NC algorithm known for this problem so far. Specifically, under the simple normalization assumption that the variable x has been scaled so as to confine the zeros of p(x) to the unit disc x : |x| ≤ 1, our algorithms (which promise to be practically effective) approximate all the zeros of p(x) within the absolute error bound 2^−b, by using order of n arithmetic operations and order of (b + n)n² Boolean (bitwise) operations (in both cases up to within polylogarithmic factors). The algorithms allow their optimal (work preserving) NC parallelization, so that they can be implemented by using polylogarithmic time and the orders of n arithmetic processors or (b + n)n² Boolean processors. All the cited bounds on the computational complexity are within polylogarithmic factors from the optimum (in terms of n and b) under both arithmetic and Boolean models of computation (in the Boolean case, under the additional (realistic) assumption that n = O(b)). 相似文献

12.

Algorithms of discretization of algebraic spatial curves on homogeneous cubical grids

Wojciech Mokrzycki 《Computers & Graphics》1988,12(3-4):477-487

In this paper new methods of discretization (integer approximation) of algebraic spatial curves in the form of intersecting surfaces P(x, y, z) = 0 and Q(x, y, z) = 0 are analyzed.

The use of homogeneous cubical grids G(h³) to discretize a curve is the essence of the method. Two new algorithms of discretization (on 6-connected grid G_6c(h³) and 26-connected grid G₂₆(h³)) are presented based on the method above. Implementation of the algorithms for algebraic spatial curves is suggested. The elaborated algorithms are adjusted for application in computer graphics and numerical control of machine tools. 相似文献

13.

Resolvability and the upper dimension of graphs

G. Chartrand C. Poisson P. Zhang 《Computers & Mathematics with Applications》2000,39(12):19-28

For an ordered set W = {w₁, w₂,…, w_k} of vertices and a vertex v in a connected graph G, the (metric) representation of v with respect to W is the k-vector r(v | W) = (d(v, w₁), d(v, w₂),…, d(v, w_k)), where d(x, y) represents the distance between the vertices x and y. The set W is a resolving set for G if distinct vertices of G have distinct representations. A new sharp lower bound for the dimension of a graph G in terms of its maximum degree is presented.

A resolving set of minimum cardinality is a basis for G and the number of vertices in a basis is its (metric) dimension dim(G). A resolving set S of G is a minimal resolving set if no proper subset of S is a resolving set. The maximum cardinality of a minimal resolving set is the upper dimension dim⁺(G). The resolving number res(G) of a connected graph G is the minimum k such that every k-set W of vertices of G is also a resolving set of G. Then 1 ≤ dim(G) ≤ dim⁺(G) ≤ res(G) ≤ n − 1 for every nontrivial connected graph G of order n. It is shown that dim⁺(G) = res(G) = n − 1 if and only if G = K_n, while dim⁺(G) = res(G) = 2 if and only if G is a path of order at least 4 or an odd cycle.

The resolving numbers and upper dimensions of some well-known graphs are determined. It is shown that for every pair a, b of integers with 2 ≤ a ≤ b, there exists a connected graph G with dim(G) = dim⁺(G) = a and res(G) = b. Also, for every positive integer N, there exists a connected graph G with res(G) − dim⁺(G) ≥ N and dim⁺(G) − dim(G) ≥ N. 相似文献

14.

Nonlinear robust stabilization of multi-parameter families of systems

Bertina Ho-Mock-Qai 《Systems & Control Letters》2002,45(5):331-338

We show that given any family of asymptotically stabilizable LTI systems depending continuously on a parameter that lies in some subset [a₁,b₁]××[a_p,b_p] of , there exists a C⁰ time-varying state feedback law v(t,x) (resp. a C⁰ time-invariant feedback law v(x)) which robustly globally exponentially stabilizes (resp. which robustly stabilizes, not asymptotically) the family. Further, if these systems are obtained by linearizing some nonlinear systems, then v(t,x) locally exponentially stabilizes these nonlinear systems. Finally, v(t,x) globally exponentially stabilizes any time-varying system which switches “slowly enough” between the given LTI systems. 相似文献

15.

Distributed selectsort sorting algorithms on broadcast communication networks

Jau-Hsiung Huang Leonard Kleinrock 《Parallel Computing》1990,16(2-3):183-190

In this paper, a distributed selectsort algorithm and a parameterized selectsort algorithm are presented to be applied on distributed systems for cases when N P where N is the number of elements to be sorted and P is the number of processors in the system. The distributed system considered in this paper uses a broadcasting channel for communication between processors. We show that the number of messages required for the parameterized selectsort algorithm is independent of N and is of complexity O(P), which is optimal in a distributed system with P processors. Furthermore, the amount of communication required in terms of elements is N + O(P³) and the computation time complexity is O((N/P)lgN + P²lg(N/P)). Hence, when N P³, the computation time complexity is O((N/P)lgN), which is optimal using P processors. In addition, this parameterized algorithm provides us with a parameter K such that by choosing the value of K allows us to trade among processing requirement, memory requirement, and communication requirement. It is shown that this parameterized algorithm can reduce the communication requirements significantly while only slightly increasing the computation requirements. 相似文献

16.

Efficient algorithms for shortest distance queries on special classes of polygons

R. Sridhar K. Han N. Chandrasekharan 《Theoretical computer science》1995,140(2):291-300

The problem of finding a rectilinear minimum bend path (RMBP) between two designated points inside a rectilinear polygon has applications in robotics and motion planning. In this paper, we present efficient algorithms to solve the query version of the RMBP problem for special classes of rectilinear polygons given their visibility graphs. Specifically, we show that given an unweighted graph G = (V, E), with ¦V¦ = N and ¦E¦ = M, algorithms to preprocess G in linear space and time such that the shortest distance queries — queries asking for the distance between any pair of nodes in the graph — can be answered in constant time and space are presented in this paper. For the case of a chordal graph G, our algorithms give a distance which is at most one away from the actual shortest distance. When G is a K-chordal graph, our algorithm produces an exact shortest distance in O(K) time. We also present a non-trivial parallel implementation of the sequential preprocessing algorithm for the CREW-PRAM model which runs in O(log² N) time using O(N + M) processors. After the preprocessing, we can answer the queries in constant time using a single processor. 相似文献

17.

Parallel algorithms for solving the satisfaction problem of functional and multivalued data dependencies

Chao-Chih Yang Weicong Shen 《Data & Knowledge Engineering》1989,3(4):323-338

Parallel algorithms for solving the satisfaction problem of non-trivial functional and multivalued data dependencies (FDs and MVDs) in a relation of N tuples by M processors are developed in this paper. Algorithms performing, in a parallel manner, batch or interactive checking of these data dependencies are also discussed. The M processors are organized as a linear systolic array. The time complexities of the first two algorithms for solving the FD satisfaction problem under M N are both O(N), and that of Algorithm (3) or (4) for solving the FD or MVD satisfaction problem under N M is O(N²/M). The latter complexity reduced to O(N) if N = M and is at least not worse than O(N log N) if N = M (N/log N). 相似文献

18.

Discontinuous quasivariational-like inequalities

P. Cubiotti 《Computers & Mathematics with Applications》1995,29(12):9-12

In this note, we deal with the following problem: given X Rⁿ, a multification gG : X → 2^X, two (single-valued) maps f : X → Rⁿ, η : X × X → Rⁿ, find a point x* X such that x* Γ (x*) and f(x*), η(x,x*) ≥ 0 for all x Γ(x*). We prove an existence theorem in which, in particular, the function f is not supposed to be continuous. 相似文献

19.

A simple solution to the l optimization problem

Mark A Mendlovitz 《Systems & Control Letters》1989,12(5):461-463

It is pointed out in this brief paper that the l¹ optimization problem min_{Q ε l_qp¹} | H − U * Q * V |₁, H ε l_mn¹, U ε l_mq¹, V ε l_pn¹ can be solved in one step rather than two. The solution of the dual problem is obviated by the direct solution of the primal problem via linear programming. The method here is applicable to finite-dimensional problems or approximating finite-dimensional problems, in the general case. 相似文献

20.

Fast adaptive PNN-based thresholding algorithms

Kuo-Liang Wan-Yu 《Pattern recognition》2003,36(12):2793-2804

Thresholding is a fundamental operation in image processing. Based on the pairwise nearest neighbor technique and the variance criterion, this theme presents two fast adaptive thresholding algorithms. The proposed first algorithm takes O((m−k)mτ) time where k denotes the number of thresholds specified by the user; m denotes the size of the compact image histogram, and the parameter τ has the constraint 1τm. On a set of different real images, experimental results reveal that the proposed first algorithm is faster than the previous three algorithms considerably while having a good feature-preserving capability. The previous three mentioned algorithms need O(m^k) time. Given a specific peak-signal-to-noise ratio (PSNR), we further present the second thresholding algorithm to determine the number of thresholds as few as possible in order to obtain a thresholded image satisfying the given PSNR. The proposed second algorithm takes O((m−k)mτ+γN) time where N and γ denote the image size and the fewest number of thresholds required, respectively. Some experiments are carried out to demonstrate the thresholded images that are encouraging. Since the time complexities required in our proposed two thresholding algorithms are polynomial, they could meet the real-time demand in image preprocessing. 相似文献