Similar Literature
20 similar documents were retrieved (search time: 22 ms).
1.
An algorithm for reconstructing a binary array of size N × N from its forest-of-quadtrees representation is presented. The algorithm traverses each tree of the forest in preorder and maps each 'black' node into the spatial domain. The time complexity of the mapping is O(log N × B_n + B_p), where B_n is the number of black nodes in the forest and B_p is the number of black pixels in the N × N array. The algorithm has been implemented on an Apple II.
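For illustration, here is a minimal sequential sketch of such a reconstruction (not the paper's forest-based algorithm): a preorder traversal that paints every 'black' quadtree leaf into an N × N array. The tuple-based node layout and the NW/NE/SW/SE quadrant order are assumptions made only for this example.

```python
# Illustrative sketch: preorder traversal of a quadtree, painting each
# 'black' leaf into the N x N binary array region it covers.
def reconstruct(node, grid, row, col, size):
    """node is ('black',), ('white',) or ('gray', nw, ne, sw, se)."""
    if node[0] == 'black':
        for r in range(row, row + size):
            for c in range(col, col + size):
                grid[r][c] = 1
    elif node[0] == 'gray':
        half = size // 2
        nw, ne, sw, se = node[1:]
        reconstruct(nw, grid, row,        col,        half)
        reconstruct(ne, grid, row,        col + half, half)
        reconstruct(sw, grid, row + half, col,        half)
        reconstruct(se, grid, row + half, col + half, half)
    # 'white' nodes contribute nothing

N = 4
tree = ('gray', ('black',), ('white',), ('white',), ('black',))
image = [[0] * N for _ in range(N)]
reconstruct(tree, image, 0, 0, N)
print(image)
```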

2.
Parallel clustering algorithms   (Total citations: 3; self-citations: 0; citations by others: 3)
Clustering techniques play an important role in exploratory pattern analysis, unsupervised learning and image segmentation applications. Many clustering algorithms, both partitional and hierarchical, require intensive computation even for a modest number of patterns. This paper presents two parallel clustering algorithms. For a clustering problem with N = 2^n patterns and M = 2^m features, the time complexity of the traditional partitional clustering algorithm on a single-processor computer is O(MNK), where K is the number of clusters. The proposed algorithm on an SIMD computer with MN processors has a time complexity of O(K(n + m)). The time complexity of the proposed single-link hierarchical clustering algorithm is reduced from O(MN^2) on a uniprocessor to O(nN) with MN processors.
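As a point of reference for the sequential O(MNK) bound, the sketch below shows the assignment step of one partitional (k-means-style) clustering pass; the data layout and names are illustrative assumptions, not the paper's parallel algorithm.

```python
# Sketch of the O(MNK) work in one partitional-clustering pass: each of
# the N patterns is compared with each of the K centroids over M features.
def assign_to_clusters(patterns, centroids):
    """patterns: N x M list of lists; centroids: K x M. Returns labels."""
    labels = []
    for p in patterns:                        # N patterns
        best, best_dist = 0, float('inf')
        for k, c in enumerate(centroids):     # K clusters
            dist = sum((pj - cj) ** 2 for pj, cj in zip(p, c))  # M features
            if dist < best_dist:
                best, best_dist = k, dist
        labels.append(best)
    return labels

print(assign_to_clusters([[0, 0], [5, 5], [6, 5]], [[0, 0], [5, 5]]))  # [0, 1, 1]
```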

3.
This paper improves the computation of two nearest-neighbor problems for images on a reconfigurable array of processors (RAP) by increasing the bus width between processors. Based on a base-n number system, a constant-time algorithm is first presented for computing the maximum/minimum of N log N-bit unsigned integers on a RAP using N processors, each with an N^(1/c)-bit bus width, where c is a constant and c ≥ 1. Two basic operations, image component labeling and border following, are then derived from it. Finally, these algorithms are used to design two constant-time algorithms for the nearest-neighbor black pixel and the nearest-neighbor component problems on an N^(1/2) × N^(1/2) image using N^(1/2) × N^(1/2) processors, each with an N^(1/c)-bit bus width, where c is a constant and c ≥ 1. Another contribution of this paper is that the execution time of the proposed algorithms is tunable by the bus width.

4.
A heap structure designed for secondary storage is suggested that tries to make the best use of the available buffer space in primary memory. The heap is a complete multi-way tree, with multi-page blocks of records as nodes, satisfying a generalized heap property. A special feature of the tree is that the nodes may be partially filled, as in B-trees. The structure is complemented with the priority-queue operations insert and delete-max. When handling a sequence of S operations, the number of page transfers performed is shown to be O(∑_{i=1}^{S} (1/P) log_{M/P}(N_i/P)), where P denotes the number of records fitting into a page, M the capacity of the buffer space in records, and N_i the number of records in the heap prior to the i-th operation (assuming P ≥ 1 and S > M ≥ c·P, where c is a small positive constant). The number of comparisons required when handling the sequence is O(∑_{i=1}^{S} log_2 N_i). Using the suggested data structure we obtain an optimal external heapsort that performs O((N/P) log_{M/P}(N/P)) page transfers and O(N log_2 N) comparisons in the worst case when sorting N records.
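The sketch below is a small in-memory analogue of this structure: a d-ary max-heap stored in a flat list with the insert and delete-max operations. The paper's structure additionally uses multi-page blocks as nodes, allows them to be partially filled, and lives on secondary storage; none of that is modelled here.

```python
# In-memory sketch of a complete d-ary max-heap (children of node i sit at
# positions d*i+1 .. d*i+d).  The external, block-based aspects of the
# paper's heap are deliberately omitted.
class DaryMaxHeap:
    def __init__(self, d=4):
        self.d, self.a = d, []

    def insert(self, x):
        self.a.append(x)
        i = len(self.a) - 1
        while i > 0 and self.a[(i - 1) // self.d] < self.a[i]:
            parent = (i - 1) // self.d
            self.a[i], self.a[parent] = self.a[parent], self.a[i]
            i = parent

    def delete_max(self):
        top, last = self.a[0], self.a.pop()
        if self.a:
            self.a[0] = last
            i = 0
            while True:
                kids = range(self.d * i + 1, min(self.d * (i + 1) + 1, len(self.a)))
                if not kids:
                    break
                m = max(kids, key=lambda j: self.a[j])
                if self.a[m] <= self.a[i]:
                    break
                self.a[i], self.a[m] = self.a[m], self.a[i]
                i = m
        return top

h = DaryMaxHeap(d=4)
for x in [5, 1, 9, 3, 7]:
    h.insert(x)
print([h.delete_max() for _ in range(5)])   # [9, 7, 5, 3, 1]
```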

5.
An efficient parallel algorithm for merging two sorted lists is presented. The algorithm is based on a novel partitioning algorithm that splits the two lists among the processors in a way that ensures load balance during the merge. The partitioning algorithm can itself be efficiently parallelized, allowing the solution to scale with increased numbers of processors. A shared-memory multiprocessor is assumed. The time complexity for partitioning and merging is O(N/p + log N), where p is the number of processors and N is the total number of elements in the two lists. Implementation results on a twenty-node Sequent Symmetry multiprocessor are also presented.
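A minimal sketch of the load-balancing idea follows: for each processor boundary at global rank k, a binary search finds how the first k merged elements split between the two lists, so that every processor can merge a slice of roughly N/p elements independently. This is the generic co-ranking formulation, not necessarily the paper's exact partitioning routine.

```python
# For global rank k, return (i, j) with i + j = k such that a[:i] and b[:j]
# together contain the k smallest elements of the two sorted lists.
def split_point(k, a, b):
    lo, hi = max(0, k - len(b)), min(k, len(a))
    while lo < hi:
        i = (lo + hi) // 2
        if b[k - i - 1] > a[i]:
            lo = i + 1          # the first k elements need more of a
        else:
            hi = i
    return lo, k - lo

a, b = [1, 3, 5, 7], [2, 4, 6, 8]
p, n = 4, len(a) + len(b)
bounds = [split_point(t * n // p, a, b) for t in range(p + 1)]
# processor t merges a[bounds[t][0]:bounds[t+1][0]] with
#                    b[bounds[t][1]:bounds[t+1][1]]
print(bounds)   # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
```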

6.
In this paper, a distributed selectsort algorithm and a parameterized selectsort algorithm are presented for application on distributed systems in cases where N ≫ P, where N is the number of elements to be sorted and P is the number of processors in the system. The distributed system considered in this paper uses a broadcasting channel for communication between processors. We show that the number of messages required by the parameterized selectsort algorithm is independent of N and is of complexity O(P), which is optimal in a distributed system with P processors. Furthermore, the amount of communication required in terms of elements is N + O(P^3) and the computation time complexity is O((N/P) lg N + P^2 lg(N/P)). Hence, when N ≥ P^3, the computation time complexity is O((N/P) lg N), which is optimal using P processors. In addition, this parameterized algorithm provides a parameter K such that choosing the value of K allows us to trade among processing requirements, memory requirements, and communication requirements. It is shown that this parameterized algorithm can reduce the communication requirements significantly while only slightly increasing the computation requirements.

7.
Based on current fiber-optic technology, a new computational model, called a linear array with a reconfigurable pipelined bus system (LARPBS), is proposed in this paper. A parallel quicksort algorithm is implemented on the model, and its time complexity is analyzed. For a set of N numbers, the quicksort algorithm reported in this paper runs in O(log_2 N) average time on a linear array with a reconfigurable pipelined bus system of size N. If the number of processors available is reduced to P, where P < N, the algorithm runs in O((N/P) log_2 N) average time and is still scalable. Besides proposing a new algorithm on the model, some basic data movement operations involved in the algorithm are discussed. We believe that these operations can be used to design other parallel algorithms on the same model. Future research in this area is also identified in this paper.

8.
In this paper, model sets for linear time-invariant systems spanned by fixed-pole orthonormal bases are investigated. The obtained model sets are shown to be complete in L^p(T) (1 < p < ∞), the Lebesgue spaces of functions on the unit circle T, and in C(T), the space of periodic continuous functions on T. The L^p-norm error bounds for estimating systems in L^p(T) by the partial sums of the Fourier series formed by the orthonormal functions are computed for the case 1 < p < ∞. Some inequalities on the mean growth of the Fourier series are also derived. These results have applications in estimation and model reduction.

9.
This paper presents an efficient algorithm for enumerating all minimal a-b separators separating given non-adjacent vertices a and b in an undirected connected simple graph G = (V, E). Our algorithm requires O(n^3 R_ab) time, which improves the known result of O(n^4 R_ab) time for solving this problem, where |V| = n and R_ab is the number of minimal a-b separators. The algorithm can be generalized for enumerating all minimal A-B separators that separate non-adjacent vertex sets A, B ⊂ V, and in this case it requires O(n^2 (n − n_A − n_B) R_AB) time, where n_A = |A|, n_B = |B| and R_AB is the number of all minimal A-B separators. Using the algorithm above as a routine, an efficient algorithm for enumerating all minimal separators of G, i.e. those separating G into at least two connected components, is constructed. The algorithm runs in time O(n^3 R_Σ^+ + n^4 R_Σ), which improves the known result of O(n^6 R_Σ) time, where R_Σ is the number of all minimal separators of G and R_Σ ≤ R_Σ^+ = Σ_{(v_i, v_j) ∉ E} R_{v_i v_j} ≤ (n(n − 1)/2 − m) R_Σ. Efficient parallelization of these algorithms is also discussed. It is shown that the first algorithm requires at most O((n/log n) R_ab) time and the second one runs in time O((n/log n) R_Σ^+ + n log n · R_Σ) on a CREW PRAM with O(n^3) processors.

10.
This paper describes some new techniques for the rapid evaluation and fitting of radial basis functions. The techniques are based on the hierarchical and multipole expansions recently introduced by several authors for the calculation of many-body potentials. Consider in particular the N-term thin-plate spline in 2 dimensions, s(x) = Σ_{j=1}^{N} d_j φ(x − x_j), where φ(u) = |u|^2 log|u|. The direct evaluation of s at a single extra point requires an extra O(N) operations. This paper shows that, with judicious use of series expansions, the incremental cost of evaluating s(x) to within precision ε can be cut to O(1 + |log ε|) operations. In particular, if A is the interpolation matrix, a_{i,j} = φ(x_i − x_j), the technique allows computation of the matrix-vector product Ad in O(N), rather than the previously required O(N^2), operations, and using only O(N) storage. Fast, storage-efficient computation of this matrix-vector product makes preconditioned conjugate-gradient methods very attractive as solvers of the interpolation equations Ad = y when N is large.
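For context, the sketch below carries out the direct (naive) thin-plate-spline evaluation that these expansions accelerate: O(N) work per evaluation point, hence O(N^2) for a full product Ad. The fast hierarchical/multipole evaluation itself is not reproduced; the NumPy code and names are only illustrative.

```python
import numpy as np

def tps_phi(r2):
    """phi(u) = |u|^2 log|u|, computed from r^2 as 0.5 * r^2 * log(r^2)."""
    with np.errstate(divide='ignore', invalid='ignore'):
        out = 0.5 * r2 * np.log(r2)
    return np.where(r2 > 0, out, 0.0)

def tps_eval(x, centres, d):
    """Direct evaluation of s(x) = sum_j d_j * phi(x - x_j): O(N) per point."""
    r2 = np.sum((centres - x) ** 2, axis=1)
    return np.dot(d, tps_phi(r2))

rng = np.random.default_rng(0)
centres = rng.random((100, 2))          # the centres x_j
d = rng.standard_normal(100)            # the coefficients d_j
print(tps_eval(np.array([0.3, 0.7]), centres, d))
```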

11.
Stéphane, Pattern Recognition, 1995, 28(12): 1993-2000
We propose a parallel thinning algorithm for binary pictures. Given an N × N binary image containing an object, our algorithm computes the skeleton of the object in O(N^2) time, using a pyramidal decomposition of the picture. The behavior of this algorithm is studied for a family of digitizations of the same object at different levels of resolution. With the Exclusive Read Exclusive Write (EREW) Parallel Random Access Machine (PRAM), our algorithm runs in O(log N) time using O(N^2/log N) processors and is work-optimal. The same result is obtained with high-connectivity distributed-memory SIMD machines having strong hypercube and pyramid topologies. We describe the basic operator, the pyramidal algorithm and some experimental results on the SIMD MasPar parallel machine.

12.
Nested dissection is a very popular direct method for solving sparse linear systems that arise from finite difference and finite element methods. Worley and Schreiber [16] give a fine-grain algorithm for a square array of processors. Their algorithm uses O(N^2) processors, each with O(N) memory, to factor an N^2 × N^2 sparse matrix whose graph is an N × N mesh. The efficiency of their method is between 1/46 and 1/12. George et al. [6] [8] give a medium-grain algorithm for the hypercube architecture, while George et al. [7] give an algorithm for shared-memory machines. These papers present a column-oriented approach which can exploit O(N) parallelism and yield efficiencies up to 50%. Lucas [11] also gives a column-oriented scheme which achieves up to 75% efficiency and O(N) parallelism. In this paper, we present a medium- to fine-grain algorithm for a P × P array of processors with local memory. This algorithm can exploit up to O(N^2) parallelism. The efficiency of the fine-grain version is comparable to [16], while as a medium-grain algorithm it achieves about 49% efficiency. The strength of the method is due to three factors: its ability to pipeline much of the computation, overlapping of computation and communication, and the use of level-3 BLAS-like primitives. In addition to its high efficiency, its memory requirement is optimal: only O(N^2 log N / P^2) words of memory are needed per processor.

13.
This paper presents a sum-of-product neural network (SOPNN) structure. The SOPNN can learn to implement the static mappings that multilayer neural networks and radial basis function networks normally perform. The output of the neural network has the sum-of-product form Σ_{i=1}^{N_p} Π_{j=1}^{N_v} f_ij(x_j), where the x_j are inputs, N_v is the number of inputs, f_ij(·) is a function generated through network training, and N_p is the number of product terms. The function f_ij(x_j) can be expressed as Σ_k w_ijk B_jk(x_j), where B_jk(·) is a single-variable basis function and the w_ijk are weight values. Linear memory arrays can be used to store the weights. If B_jk(·) is a Gaussian function, the new neural network degenerates to a Gaussian function network. This paper focuses on the use of overlapped rectangular pulses as the basis functions. With such basis functions, w_ijk B_jk(x_j) equals either zero or w_ijk, and the computation of f_ij(x_j) becomes a simple addition of some retrieved w_ijk values. The structure can be viewed as a basis function network with a flexible form for the basis functions. Learning can start with a small set of submodules and have new submodules added when it becomes necessary. The new neural network structure demonstrates excellent learning convergence characteristics and requires little memory space. It has merits over multilayer neural networks, radial basis function networks and CMAC in function approximation and mapping in high-dimensional input space. The technique has been tested for function approximation, prediction of a time series, learning control, and classification.
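A minimal sketch of this sum-of-product evaluation with overlapping rectangular pulses is given below; the pulse width, the amount of overlap and the weight layout are assumptions chosen for illustration, not the paper's exact configuration.

```python
import numpy as np

K, OVERLAP = 8, 2   # K pulses per input; each pulse spans OVERLAP grid cells

def active_pulses(x):
    """Indices k with B_k(x) = 1 for overlapping unit-height pulses on [0, 1)."""
    cell = min(int(x * K), K - 1)
    return range(max(0, cell - OVERLAP + 1), cell + 1)

def sopnn_output(x, W):
    """y = sum_i prod_j f_ij(x_j) with f_ij(x_j) = sum of the retrieved w_ijk.
    x: sequence of N_v inputs in [0, 1); W: array of shape (N_p, N_v, K)."""
    Np, Nv, _ = W.shape
    y = 0.0
    for i in range(Np):                   # sum over product terms
        term = 1.0
        for j in range(Nv):               # product over inputs
            term *= sum(W[i, j, k] for k in active_pulses(x[j]))
        y += term
    return y

W = np.ones((3, 2, K))                    # N_p = 3 product terms, N_v = 2 inputs
print(sopnn_output([0.25, 0.8], W))       # 3 * (2 * 2) = 12.0
```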

14.
The determination of the rotational symmetry of a shape is useful for object recognition and shape analysis in computer vision applications. In this paper, a simple but effective algorithm to analyse the rotational symmetry of a given closed-curve shape S is proposed. A circle C, with the centroid of S as the circle center and the average radius of S as the circle radius, is superimposed on S, so that C and S intersect at a set of points. By theoretical analysis, the relationship between the order N_s of the rotational symmetry of S and the number of intersection points between C and S is established. All the possible values for N_s are determined, and finally N_s is fixed by evaluating the similarity between S and its rotated versions. In the proposed algorithm, only simple pixel operations and second-order moment computations are involved. Several problems caused by the use of discrete image coordinates are analysed and solved for correct decision-making in the algorithm. Good experimental results are also included.

15.
Hypercube networks offer a feasible, cost-effective solution to parallel computing. Here, a large number of low-cost processors with their own local memories are connected to form an n-cube (B_n) or one of its variants, and inter-processor communication takes place by message passing instead of shared variables. This paper addresses a constrained two-terminal reliability measure referred to as distance reliability (DR), which considers the probability that a message can be delivered in optimal time from a given node s to a node t. The problem is equivalent to that of having an operational optimal path (not just any path) between the two nodes. In B_n, the Hamming distance between the labels of s and t, H(s, t), determines the length of the optimal path between the two nodes. The shortest-distance restriction guarantees optimal communication delay between processors and high link/node utilization across the network. Moreover, it provides a measure of the robustness of symmetric networks. In particular, when H(s, t) = n in B_n, DR yields the probability of degradation in the diameter, a concept directly related to the fault-diameter. The paper proposes two schemes to evaluate DR in B_n. The first scheme uses a combinatorial approach that limits the number of faulty components to 2H(s, t) − 2, while the second outlines paths of length H(s, t) and then generates a recursive closed-form solution to compute DR. The theoretical results have been verified by simulation. The discrepancy between the theoretical and simulation results is in most cases below 1% and in the worst case 4.6%.
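To make these quantities concrete, the sketch below computes H(s, t) as the popcount of s XOR t and enumerates the optimal (shortest) paths between s and t: each ordering of the differing dimensions yields one such path, so there are H(s, t)! of them (enumeration is feasible only for small distances). The paper's two DR evaluation schemes are not reproduced here.

```python
from itertools import permutations

def hamming(s, t):
    """Optimal path length between nodes s and t of the n-cube."""
    return bin(s ^ t).count("1")

def optimal_paths(s, t):
    """All shortest s-t paths; one per ordering of the differing dimensions."""
    diff = s ^ t
    dims = [d for d in range(diff.bit_length()) if diff >> d & 1]
    paths = []
    for order in permutations(dims):
        node, path = s, [s]
        for d in order:
            node ^= 1 << d
            path.append(node)
        paths.append(path)
    return paths

s, t = 0b000, 0b101
print(hamming(s, t), len(optimal_paths(s, t)))   # 2 and 2! = 2 shortest paths
```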

16.
This paper describes several parallel algorithms for image edge relaxation on array processors with different numbers of processing elements (PEs) connected by a mesh or hypercube network. The time complexity of Prager's original edge relaxation scheme is O(N^2) per iteration using floating-point operations on a sequential machine, where N^2 is the number of pixels in the image. Modifications to the scheme are made so that no multiplications are employed and only integer operations are required. Moreover, with parallel processing, the time complexity per iteration is reduced to a constant. A time complexity analysis of two parallel algorithms is performed. Although the algorithm on an array processor with 4N^2 PEs achieves a higher degree of parallelism, the algorithm with N^2 PEs is preferred. Further modifications of the latter algorithm are made to accommodate fewer PEs.

17.
Parallel algorithms for solving the satisfaction problem of non-trivial functional and multivalued data dependencies (FDs and MVDs) in a relation of N tuples by M processors are developed in this paper. Algorithms that perform batch or interactive checking of these data dependencies in a parallel manner are also discussed. The M processors are organized as a linear systolic array. The time complexities of the first two algorithms for solving the FD satisfaction problem under M ≥ N are both O(N), and that of Algorithm (3) or (4) for solving the FD or MVD satisfaction problem under N ≥ M is O(N^2/M). The latter complexity reduces to O(N) if M = N and is no worse than O(N log N) if M = Ω(N/log N).
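A sequential sketch of the underlying batch check is shown below, assuming a single hashed pass that groups the N tuples by their X-projection and verifies that each group agrees on Y; the systolic-array algorithms of the paper distribute this comparison work over the M processors and are not reproduced here.

```python
# Does the relation satisfy the FD X -> Y?  One pass over the N tuples.
def satisfies_fd(relation, X, Y):
    """relation: list of dicts; X, Y: lists of attribute names."""
    seen = {}
    for t in relation:
        key = tuple(t[a] for a in X)
        val = tuple(t[a] for a in Y)
        if seen.setdefault(key, val) != val:
            return False            # two tuples agree on X but differ on Y
    return True

r = [{"emp": 1, "dept": "A", "mgr": "x"},
     {"emp": 2, "dept": "A", "mgr": "x"},
     {"emp": 3, "dept": "B", "mgr": "y"}]
print(satisfies_fd(r, ["dept"], ["mgr"]))   # True: dept -> mgr holds
```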

18.
The parallel stratagem in this paper uses the scattered square decomposition, introduced by G. Fox, for its data assignment and then exploits parallelism in the solution steps of the sequential Householder tridiagonalization algorithm. One may condense a real symmetric full matrix A of order n into tridiagonal form by the stratagem on concurrent machines where N (= D^2) processors are used. Expressions for efficiency and speedup are given for the evaluation of the stratagem. An alternative stratagem which requires less data transmission but more computation is also discussed. The results show that the Householder method of tridiagonalization may be implemented efficiently on a concurrent machine by scattered square decomposition, provided that the number of matrix elements contained in each processor is much larger than the number of processors of the concurrent machine, and that the ratio of the time to transmit one data item from one processor to any other processor to the time to perform a floating-point arithmetic operation is small enough.

19.
The paper presents parallel algorithms for solving the Poisson equation at N^2 mesh points. The methods, based on marching techniques, are structured for efficient parallel realization. Using orthogonal decomposition properties of the arising matrices, the algorithms can be formulated in terms of transformed vectors. On a MIMD computer with at most N processors, the computations can be performed in horizontal slices with minimal synchronization requirements. On an SIMD machine with N^2 processors, the complexity bound O(log N) is achieved, whereby a single marching requires only 10 log N steps.

20.
A parallel two-list algorithm for the knapsack problem   (Total citations: 10; self-citations: 0; citations by others: 10)
An n-element knapsack problem has 2^n possible solutions to search over, so exhaustive search requires 2^n trials. Because of this exponential time, the knapsack problem is considered to be very hard. In the past decade, much effort has been devoted to finding techniques that could lead to practical algorithms with reasonable running time. In 1994, Chang et al. proposed a brilliant parallel algorithm, which needs O(2^(n/8)) processors to solve the knapsack problem in O(2^(n/2)) time; that is, the cost of Chang et al.'s parallel algorithm is O(2^(5n/8)). In this paper, we propose a parallel algorithm that improves on Chang et al.'s by reducing the time complexity to O(2^(3n/8)) with the same O(2^(n/8)) processors. Thus, the proposed parallel algorithm has a cost of O(2^(n/2)), an improvement over the previous literature. We believe that the proposed parallel algorithm is pragmatically feasible now that multiprocessor systems are becoming more and more popular.
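For background, the sketch below shows the sequential two-list (Horowitz-Sahni, meet-in-the-middle) idea that such parallel algorithms build on, specialised here to subset-sum: enumerate the 2^(n/2) subset sums of each half of the items, sort one list, and binary-search the best partner that stays within the capacity. The parallel decomposition over O(2^(n/8)) processors is not shown.

```python
from bisect import bisect_right

def subset_sums(items):
    """All 2^len(items) subset sums (the 'list' of one half)."""
    sums = [0]
    for w in items:
        sums += [s + w for s in sums]
    return sums

def two_list_subset_sum(weights, capacity):
    """Best achievable total not exceeding capacity, using O(2^(n/2)) space."""
    half = len(weights) // 2
    left = subset_sums(weights[:half])
    right = sorted(subset_sums(weights[half:]))
    best = 0
    for s in left:
        if s > capacity:
            continue
        j = bisect_right(right, capacity - s) - 1
        best = max(best, s + right[j])
    return best

print(two_list_subset_sum([8, 6, 5, 3, 11], 20))   # 20 (6 + 3 + 11)
```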

