Similar Literature
20 similar documents were retrieved (search time: 22 ms).
1.
An algorithm for reconstructing a binary array of size N × N from its forest-of-quadtrees representation is presented. The algorithm traverses each tree of the forest in preorder and maps each 'black' node into the spatial domain. The time complexity of the mapping is O(log N × B_n + B_p), where B_n is the number of black nodes in the forest and B_p is the number of black pixels in the N × N array. The algorithm has been implemented on an Apple II.
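For illustration, here is a minimal sequential sketch of such a reconstruction (not the paper's forest-based algorithm): a preorder traversal that paints every 'black' quadtree leaf into an N × N array. The tuple-based node layout and the NW/NE/SW/SE quadrant order are assumptions made only for this example.

```python
# Illustrative sketch: preorder traversal of a quadtree, painting each
# 'black' leaf into the N x N binary array region it covers.
def reconstruct(node, grid, row, col, size):
    """node is ('black',), ('white',) or ('gray', nw, ne, sw, se)."""
    if node[0] == 'black':
        for r in range(row, row + size):
            for c in range(col, col + size):
                grid[r][c] = 1
    elif node[0] == 'gray':
        half = size // 2
        nw, ne, sw, se = node[1:]
        reconstruct(nw, grid, row,        col,        half)
        reconstruct(ne, grid, row,        col + half, half)
        reconstruct(sw, grid, row + half, col,        half)
        reconstruct(se, grid, row + half, col + half, half)
    # 'white' nodes contribute nothing

N = 4
tree = ('gray', ('black',), ('white',), ('white',), ('black',))
image = [[0] * N for _ in range(N)]
reconstruct(tree, image, 0, 0, N)
print(image)
```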

2.
Parallel clustering algorithms   (Total citations: 3; self-citations: 0; citations by others: 3)
Clustering techniques play an important role in exploratory pattern analysis, unsupervised learning and image segmentation applications. Many clustering algorithms, both partitional and hierarchical, require intensive computation even for a modest number of patterns. This paper presents two parallel clustering algorithms. For a clustering problem with N = 2^n patterns and M = 2^m features, the time complexity of the traditional partitional clustering algorithm on a single-processor computer is O(MNK), where K is the number of clusters. The proposed algorithm on an SIMD computer with MN processors has a time complexity of O(K(n + m)). The time complexity of the proposed single-link hierarchical clustering algorithm is reduced from O(MN^2) on a uniprocessor to O(nN) with MN processors.
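As a point of reference for the sequential O(MNK) bound, the sketch below shows the assignment step of one partitional (k-means-style) clustering pass; the data layout and names are illustrative assumptions, not the paper's parallel algorithm.

```python
# Sketch of the O(MNK) work in one partitional-clustering pass: each of
# the N patterns is compared with each of the K centroids over M features.
def assign_to_clusters(patterns, centroids):
    """patterns: N x M list of lists; centroids: K x M. Returns labels."""
    labels = []
    for p in patterns:                        # N patterns
        best, best_dist = 0, float('inf')
        for k, c in enumerate(centroids):     # K clusters
            dist = sum((pj - cj) ** 2 for pj, cj in zip(p, c))  # M features
            if dist < best_dist:
                best, best_dist = k, dist
        labels.append(best)
    return labels

print(assign_to_clusters([[0, 0], [5, 5], [6, 5]], [[0, 0], [5, 5]]))  # [0, 1, 1]
```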

3.
This paper improves the computation of two nearest-neighbor problems for images on a reconfigurable array of processors (RAP) by increasing the bus width between processors. Based on a base-n number system, a constant-time algorithm is first presented for computing the maximum/minimum of N log N-bit unsigned integers on a RAP using N processors, each with an N^(1/c)-bit bus width, where c is a constant and c ≥ 1. Two basic operations, image component labeling and border following, are then derived from it. Finally, these algorithms are used to design two constant-time algorithms for the nearest-neighbor black pixel and the nearest-neighbor component problems on an N^(1/2) × N^(1/2) image using N^(1/2) × N^(1/2) processors, each with an N^(1/c)-bit bus width, where c is a constant and c ≥ 1. Another contribution of this paper is that the execution time of the proposed algorithms is tunable by the bus width.

4.
A heap structure designed for secondary storage is suggested that tries to make the best use of the available buffer space in primary memory. The heap is a complete multi-way tree, with multi-page blocks of records as nodes, satisfying a generalized heap property. A special feature of the tree is that the nodes may be partially filled, as in B-trees. The structure is complemented with the priority-queue operations insert and delete-max. When handling a sequence of S operations, the number of page transfers performed is shown to be O(∑_{i=1}^{S} (1/P) log_{M/P}(N_i/P)), where P denotes the number of records fitting into a page, M the capacity of the buffer space in records, and N_i the number of records in the heap prior to the i-th operation (assuming P ≥ 1 and S > M ≥ c·P, where c is a small positive constant). The number of comparisons required when handling the sequence is O(∑_{i=1}^{S} log_2 N_i). Using the suggested data structure we obtain an optimal external heapsort that performs O((N/P) log_{M/P}(N/P)) page transfers and O(N log_2 N) comparisons in the worst case when sorting N records.
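The sketch below is a small in-memory analogue of this structure: a d-ary max-heap stored in a flat list with the insert and delete-max operations. The paper's structure additionally uses multi-page blocks as nodes, allows them to be partially filled, and lives on secondary storage; none of that is modelled here.

```python
# In-memory sketch of a complete d-ary max-heap (children of node i sit at
# positions d*i+1 .. d*i+d).  The external, block-based aspects of the
# paper's heap are deliberately omitted.
class DaryMaxHeap:
    def __init__(self, d=4):
        self.d, self.a = d, []

    def insert(self, x):
        self.a.append(x)
        i = len(self.a) - 1
        while i > 0 and self.a[(i - 1) // self.d] < self.a[i]:
            parent = (i - 1) // self.d
            self.a[i], self.a[parent] = self.a[parent], self.a[i]
            i = parent

    def delete_max(self):
        top, last = self.a[0], self.a.pop()
        if self.a:
            self.a[0] = last
            i = 0
            while True:
                kids = range(self.d * i + 1, min(self.d * (i + 1) + 1, len(self.a)))
                if not kids:
                    break
                m = max(kids, key=lambda j: self.a[j])
                if self.a[m] <= self.a[i]:
                    break
                self.a[i], self.a[m] = self.a[m], self.a[i]
                i = m
        return top

h = DaryMaxHeap(d=4)
for x in [5, 1, 9, 3, 7]:
    h.insert(x)
print([h.delete_max() for _ in range(5)])   # [9, 7, 5, 3, 1]
```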

5.
An efficient parallel algorithm for merging two sorted lists is presented. The algorithm is based on a novel partitioning algorithm that splits the two lists among the processors in a way that ensures load balance during the merge. The partitioning algorithm can itself be efficiently parallelized, allowing the solution to scale with increased numbers of processors. A shared-memory multiprocessor is assumed. The time complexity for partitioning and merging is O(N/p + log N), where p is the number of processors and N is the total number of elements in the two lists. Implementation results on a twenty-node Sequent Symmetry multiprocessor are also presented.
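A minimal sketch of the load-balancing idea follows: for each processor boundary at global rank k, a binary search finds how the first k merged elements split between the two lists, so that every processor can merge a slice of roughly N/p elements independently. This is the generic co-ranking formulation, not necessarily the paper's exact partitioning routine.

```python
# For global rank k, return (i, j) with i + j = k such that a[:i] and b[:j]
# together contain the k smallest elements of the two sorted lists.
def split_point(k, a, b):
    lo, hi = max(0, k - len(b)), min(k, len(a))
    while lo < hi:
        i = (lo + hi) // 2
        if b[k - i - 1] > a[i]:
            lo = i + 1          # the first k elements need more of a
        else:
            hi = i
    return lo, k - lo

a, b = [1, 3, 5, 7], [2, 4, 6, 8]
p, n = 4, len(a) + len(b)
bounds = [split_point(t * n // p, a, b) for t in range(p + 1)]
# processor t merges a[bounds[t][0]:bounds[t+1][0]] with
#                    b[bounds[t][1]:bounds[t+1][1]]
print(bounds)   # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
```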

6.
In this paper, a distributed selectsort algorithm and a parameterized selectsort algorithm are presented for application on distributed systems in cases where N ≫ P, where N is the number of elements to be sorted and P is the number of processors in the system. The distributed system considered in this paper uses a broadcasting channel for communication between processors. We show that the number of messages required by the parameterized selectsort algorithm is independent of N and is of complexity O(P), which is optimal in a distributed system with P processors. Furthermore, the amount of communication required in terms of elements is N + O(P^3) and the computation time complexity is O((N/P) lg N + P^2 lg(N/P)). Hence, when N ≥ P^3, the computation time complexity is O((N/P) lg N), which is optimal using P processors. In addition, this parameterized algorithm provides a parameter K such that choosing the value of K allows us to trade among processing requirements, memory requirements, and communication requirements. It is shown that this parameterized algorithm can reduce the communication requirements significantly while only slightly increasing the computation requirements.

7.
Based on current fiber-optic technology, a new computational model, called a linear array with a reconfigurable pipelined bus system (LARPBS), is proposed in this paper. A parallel quicksort algorithm is implemented on the model, and its time complexity is analyzed. For a set of N numbers, the quicksort algorithm reported in this paper runs in O(log_2 N) average time on a linear array with a reconfigurable pipelined bus system of size N. If the number of processors available is reduced to P, where P < N, the algorithm runs in O((N/P) log_2 N) average time and is still scalable. Besides proposing a new algorithm on the model, some basic data movement operations involved in the algorithm are discussed. We believe that these operations can be used to design other parallel algorithms on the same model. Future research in this area is also identified in this paper.

8.
In this paper, model sets for linear time-invariant systems spanned by fixed-pole orthonormal bases are investigated. The obtained model sets are shown to be complete in L^p(T) (1 < p < ∞), the Lebesgue spaces of functions on the unit circle T, and in C(T), the space of periodic continuous functions on T. The L^p-norm error bounds for estimating systems in L^p(T) by the partial sums of the Fourier series formed by the orthonormal functions are computed for the case 1 < p < ∞. Some inequalities on the mean growth of the Fourier series are also derived. These results have applications in estimation and model reduction.

9.
This paper presents an efficient algorithm for enumerating all minimal a-b separators separating given non-adjacent vertices a and b in an undirected connected simple graph G = (V, E). Our algorithm requires O(n^3 R_ab) time, which improves the known result of O(n^4 R_ab) time for solving this problem, where |V| = n and R_ab is the number of minimal a-b separators. The algorithm can be generalized for enumerating all minimal A-B separators that separate non-adjacent vertex sets A, B ⊂ V, and in this case it requires O(n^2 (n − n_A − n_B) R_AB) time, where n_A = |A|, n_B = |B| and R_AB is the number of all minimal A-B separators. Using the algorithm above as a routine, an efficient algorithm for enumerating all minimal separators of G, i.e. those separating G into at least two connected components, is constructed. The algorithm runs in time O(n^3 R_Σ^+ + n^4 R_Σ), which improves the known result of O(n^6 R_Σ) time, where R_Σ is the number of all minimal separators of G and R_Σ ≤ R_Σ^+ = Σ_{(v_i, v_j) ∉ E} R_{v_i v_j} ≤ (n(n − 1)/2 − m) R_Σ. Efficient parallelization of these algorithms is also discussed. It is shown that the first algorithm requires at most O((n/log n) R_ab) time and the second one runs in time O((n/log n) R_Σ^+ + n log n · R_Σ) on a CREW PRAM with O(n^3) processors.

10.
This paper describes some new techniques for the rapid evaluation and fitting of radial basis functions. The techniques are based on the hierarchical and multipole expansions recently introduced by several authors for the calculation of many-body potentials. Consider in particular the N-term thin-plate spline in 2 dimensions, s(x) = Σ_{j=1}^{N} d_j φ(x − x_j), where φ(u) = |u|^2 log|u|. The direct evaluation of s at a single extra point requires an extra O(N) operations. This paper shows that, with judicious use of series expansions, the incremental cost of evaluating s(x) to within precision ε can be cut to O(1 + |log ε|) operations. In particular, if A is the interpolation matrix, a_{i,j} = φ(x_i − x_j), the technique allows computation of the matrix-vector product Ad in O(N), rather than the previously required O(N^2), operations, and using only O(N) storage. Fast, storage-efficient computation of this matrix-vector product makes preconditioned conjugate-gradient methods very attractive as solvers of the interpolation equations Ad = y when N is large.
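For context, the sketch below carries out the direct (naive) thin-plate-spline evaluation that these expansions accelerate: O(N) work per evaluation point, hence O(N^2) for a full product Ad. The fast hierarchical/multipole evaluation itself is not reproduced; the NumPy code and names are only illustrative.

```python
import numpy as np

def tps_phi(r2):
    """phi(u) = |u|^2 log|u|, computed from r^2 as 0.5 * r^2 * log(r^2)."""
    with np.errstate(divide='ignore', invalid='ignore'):
        out = 0.5 * r2 * np.log(r2)
    return np.where(r2 > 0, out, 0.0)

def tps_eval(x, centres, d):
    """Direct evaluation of s(x) = sum_j d_j * phi(x - x_j): O(N) per point."""
    r2 = np.sum((centres - x) ** 2, axis=1)
    return np.dot(d, tps_phi(r2))

rng = np.random.default_rng(0)
centres = rng.random((100, 2))          # the centres x_j
d = rng.standard_normal(100)            # the coefficients d_j
print(tps_eval(np.array([0.3, 0.7]), centres, d))
```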

11.
Stéphane, Pattern Recognition, 1995, 28(12): 1993-2000
We propose a parallel thinning algorithm for binary pictures. Given an N × N binary image containing an object, our algorithm computes the skeleton of the object in O(N^2) time, using a pyramidal decomposition of the picture. The behavior of this algorithm is studied for a family of digitizations of the same object at different levels of resolution. With the Exclusive Read Exclusive Write (EREW) Parallel Random Access Machine (PRAM), our algorithm runs in O(log N) time using O(N^2/log N) processors and is work-optimal. The same result is obtained with high-connectivity distributed-memory SIMD machines having strong hypercube and pyramid topologies. We describe the basic operator, the pyramidal algorithm and some experimental results on the SIMD MasPar parallel machine.

12.
Nested dissection is a very popular direct method for solving sparse linear systems that arise from finite difference and finite element methods. Worley and Schreiber [16] give a fine-grain algorithm for a square array of processors. Their algorithm uses O(N^2) processors, each with O(N) memory, to factor an N^2 × N^2 sparse matrix whose graph is an N × N mesh. The efficiency of their method is between 1/46 and 1/12. George et al. [6] [8] give a medium-grain algorithm for the hypercube architecture, while George et al. [7] give an algorithm for shared-memory machines. These papers present a column-oriented approach which can exploit O(N) parallelism and yield efficiencies up to 50%. Lucas [11] also gives a column-oriented scheme which achieves up to 75% efficiency and O(N) parallelism. In this paper, we present a medium- to fine-grain algorithm for a P × P array of processors with local memory. This algorithm can exploit up to O(N^2) parallelism. The efficiency of the fine-grain version is comparable to [16], while as a medium-grain algorithm it achieves about 49% efficiency. The strength of the method is due to three factors: its ability to pipeline much of the computation, overlapping of computation and communication, and the use of level-3 BLAS-like primitives. In addition to its high efficiency, its memory requirement is optimal: only O(N^2 log N / P^2) words of memory are needed per processor.

13.
This paper presents a sum-of-product neural network (SOPNN) structure. The SOPNN can learn to implement the static mappings that multilayer neural networks and radial basis function networks normally perform. The output of the neural network has the sum-of-product form Σ_{i=1}^{N_p} Π_{j=1}^{N_v} f_ij(x_j), where the x_j are inputs, N_v is the number of inputs, f_ij(·) is a function generated through network training, and N_p is the number of product terms. The function f_ij(x_j) can be expressed as Σ_k w_ijk B_jk(x_j), where B_jk(·) is a single-variable basis function and the w_ijk are weight values. Linear memory arrays can be used to store the weights. If B_jk(·) is a Gaussian function, the new neural network degenerates to a Gaussian function network. This paper focuses on the use of overlapped rectangular pulses as the basis functions. With such basis functions, w_ijk B_jk(x_j) equals either zero or w_ijk, and the computation of f_ij(x_j) becomes a simple addition of some retrieved w_ijk values. The structure can be viewed as a basis function network with a flexible form for the basis functions. Learning can start with a small set of submodules and have new submodules added when it becomes necessary. The new neural network structure demonstrates excellent learning convergence characteristics and requires little memory space. It has merits over multilayer neural networks, radial basis function networks and CMAC in function approximation and mapping in high-dimensional input space. The technique has been tested for function approximation, prediction of a time series, learning control, and classification.
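A minimal sketch of this sum-of-product evaluation with overlapping rectangular pulses is given below; the pulse width, the amount of overlap and the weight layout are assumptions chosen for illustration, not the paper's exact configuration.

```python
import numpy as np

K, OVERLAP = 8, 2   # K pulses per input; each pulse spans OVERLAP grid cells

def active_pulses(x):
    """Indices k with B_k(x) = 1 for overlapping unit-height pulses on [0, 1)."""
    cell = min(int(x * K), K - 1)
    return range(max(0, cell - OVERLAP + 1), cell + 1)

def sopnn_output(x, W):
    """y = sum_i prod_j f_ij(x_j) with f_ij(x_j) = sum of the retrieved w_ijk.
    x: sequence of N_v inputs in [0, 1); W: array of shape (N_p, N_v, K)."""
    Np, Nv, _ = W.shape
    y = 0.0
    for i in range(Np):                   # sum over product terms
        term = 1.0
        for j in range(Nv):               # product over inputs
            term *= sum(W[i, j, k] for k in active_pulses(x[j]))
        y += term
    return y

W = np.ones((3, 2, K))                    # N_p = 3 product terms, N_v = 2 inputs
print(sopnn_output([0.25, 0.8], W))       # 3 * (2 * 2) = 12.0
```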

14.
The determination of the rotational symmetry of a shape is useful for object recognition and shape analysis in computer vision applications. In this paper, a simple but effective algorithm to analyse the rotational symmetry of a given closed-curve shape S is proposed. A circle C, with the centroid of S as the circle center and the average radius of S as the circle radius, is superimposed on S, so that C and S intersect at a set of points. By theoretical analysis, the relationship between the order N_s of the rotational symmetry of S and the number of intersection points between C and S is established. All the possible values for N_s are determined, and finally N_s is fixed by evaluating the similarity between S and its rotated versions. In the proposed algorithm, only simple pixel operations and second-order moment computations are involved. Several problems caused by the use of discrete image coordinates are analysed and solved for correct decision-making in the algorithm. Good experimental results are also included.

15.
Hypercube networks offer a feasible, cost-effective solution to parallel computing. Here, a large number of low-cost processors with their own local memories are connected to form an n-cube (B_n) or one of its variants, and inter-processor communication takes place by message passing instead of shared variables. This paper addresses a constrained two-terminal reliability measure referred to as distance reliability (DR), which considers the probability that a message can be delivered in optimal time from a given node s to a node t. The problem is equivalent to that of having an operational optimal path (not just any path) between the two nodes. In B_n, the Hamming distance between the labels of s and t, H(s, t), determines the length of the optimal path between the two nodes. The shortest-distance restriction guarantees optimal communication delay between processors and high link/node utilization across the network. Moreover, it provides a measure of the robustness of symmetric networks. In particular, when H(s, t) = n in B_n, DR yields the probability of degradation in the diameter, a concept directly related to the fault-diameter. The paper proposes two schemes to evaluate DR in B_n. The first scheme uses a combinatorial approach that limits the number of faulty components to 2H(s, t) − 2, while the second outlines paths of length H(s, t) and then generates a recursive closed-form solution to compute DR. The theoretical results have been verified by simulation. The discrepancy between the theoretical and simulation results is in most cases below 1% and in the worst case 4.6%.
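To make these quantities concrete, the sketch below computes H(s, t) as the popcount of s XOR t and enumerates the optimal (shortest) paths between s and t: each ordering of the differing dimensions yields one such path, so there are H(s, t)! of them (enumeration is feasible only for small distances). The paper's two DR evaluation schemes are not reproduced here.

```python
from itertools import permutations

def hamming(s, t):
    """Optimal path length between nodes s and t of the n-cube."""
    return bin(s ^ t).count("1")

def optimal_paths(s, t):
    """All shortest s-t paths; one per ordering of the differing dimensions."""
    diff = s ^ t
    dims = [d for d in range(diff.bit_length()) if diff >> d & 1]
    paths = []
    for order in permutations(dims):
        node, path = s, [s]
        for d in order:
            node ^= 1 << d
            path.append(node)
        paths.append(path)
    return paths

s, t = 0b000, 0b101
print(hamming(s, t), len(optimal_paths(s, t)))   # 2 and 2! = 2 shortest paths
```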

16.
This paper describes several parallel algorithms for image edge relaxation on array processors with different numbers of processing elements (PEs) connected by a mesh or hypercube network. The time complexity of Prager's original edge relaxation scheme is O(N^2) per iteration using floating-point operations on a sequential machine, where N^2 is the number of pixels in the image. Modifications to the scheme are made so that no multiplications are employed and only integer operations are required. Moreover, with parallel processing, the time complexity per iteration is reduced to a constant. A time complexity analysis of two parallel algorithms is performed. Although the algorithm on an array processor with 4N^2 PEs achieves a higher degree of parallelism, the algorithm with N^2 PEs is preferred. Further modifications of the latter algorithm are made to accommodate fewer PEs.

17.
Parallel algorithms for solving the satisfaction problem of non-trivial functional and multivalued data dependencies (FDs and MVDs) in a relation of N tuples by M processors are developed in this paper. Algorithms that perform batch or interactive checking of these data dependencies in a parallel manner are also discussed. The M processors are organized as a linear systolic array. The time complexities of the first two algorithms for solving the FD satisfaction problem under M ≥ N are both O(N), and that of Algorithm (3) or (4) for solving the FD or MVD satisfaction problem under N ≥ M is O(N^2/M). The latter complexity reduces to O(N) if M = N and is no worse than O(N log N) if M = Ω(N/log N).
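A sequential sketch of the underlying batch check is shown below, assuming a single hashed pass that groups the N tuples by their X-projection and verifies that each group agrees on Y; the systolic-array algorithms of the paper distribute this comparison work over the M processors and are not reproduced here.

```python
# Does the relation satisfy the FD X -> Y?  One pass over the N tuples.
def satisfies_fd(relation, X, Y):
    """relation: list of dicts; X, Y: lists of attribute names."""
    seen = {}
    for t in relation:
        key = tuple(t[a] for a in X)
        val = tuple(t[a] for a in Y)
        if seen.setdefault(key, val) != val:
            return False            # two tuples agree on X but differ on Y
    return True

r = [{"emp": 1, "dept": "A", "mgr": "x"},
     {"emp": 2, "dept": "A", "mgr": "x"},
     {"emp": 3, "dept": "B", "mgr": "y"}]
print(satisfies_fd(r, ["dept"], ["mgr"]))   # True: dept -> mgr holds
```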

18.
The parallel stratagem in this paper uses the scattered square decomposition, introduced by G. Fox, for its data assignment and then exploits parallelism in the solution steps of the sequential Householder tridiagonalization algorithm. One may condense a real symmetric full matrix A of order n into tridiagonal form by the stratagem on concurrent machines where N (= D^2) processors are used. Expressions for efficiency and speedup are given for the evaluation of the stratagem. An alternative stratagem which requires less data transmission but more computation is also discussed. The results show that the Householder method of tridiagonalization may be implemented efficiently on a concurrent machine by scattered square decomposition, provided that the number of matrix elements contained in each processor is much larger than the number of processors of the concurrent machine, and that the ratio of the time to transmit one data item from one processor to any other processor to the time to perform a floating-point arithmetic operation is small enough.

19.
The paper presents parallel algorithms for solving the Poisson equation at N^2 mesh points. The methods, based on marching techniques, are structured for efficient parallel realization. Using orthogonal decomposition properties of the arising matrices, the algorithms can be formulated in terms of transformed vectors. On a MIMD computer with at most N processors, the computations can be performed in horizontal slices with minimal synchronization requirements. On an SIMD machine with N^2 processors, the complexity bound O(log N) is achieved, whereby a single marching requires only 10 log N steps.

20.
A parallel two-list algorithm for the knapsack problem   (Total citations: 10; self-citations: 0; citations by others: 10)
An n-element knapsack problem has 2^n possible solutions to search over, so exhaustive search requires 2^n trials. Because of this exponential time, the knapsack problem is considered to be very hard. In the past decade, much effort has been devoted to finding techniques that could lead to practical algorithms with reasonable running time. In 1994, Chang et al. proposed a brilliant parallel algorithm, which needs O(2^(n/8)) processors to solve the knapsack problem in O(2^(n/2)) time; that is, the cost of Chang et al.'s parallel algorithm is O(2^(5n/8)). In this paper, we propose a parallel algorithm that improves on Chang et al.'s by reducing the time complexity to O(2^(3n/8)) with the same O(2^(n/8)) processors. Thus, the proposed parallel algorithm has a cost of O(2^(n/2)), an improvement over the previous literature. We believe that the proposed parallel algorithm is pragmatically feasible now that multiprocessor systems are becoming more and more popular.
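For background, the sketch below shows the sequential two-list (Horowitz-Sahni, meet-in-the-middle) idea that such parallel algorithms build on, specialised here to subset-sum: enumerate the 2^(n/2) subset sums of each half of the items, sort one list, and binary-search the best partner that stays within the capacity. The parallel decomposition over O(2^(n/8)) processors is not shown.

```python
from bisect import bisect_right

def subset_sums(items):
    """All 2^len(items) subset sums (the 'list' of one half)."""
    sums = [0]
    for w in items:
        sums += [s + w for s in sums]
    return sums

def two_list_subset_sum(weights, capacity):
    """Best achievable total not exceeding capacity, using O(2^(n/2)) space."""
    half = len(weights) // 2
    left = subset_sums(weights[:half])
    right = sorted(subset_sums(weights[half:]))
    best = 0
    for s in left:
        if s > capacity:
            continue
        j = bisect_right(right, capacity - s) - 1
        best = max(best, s + right[j])
    return best

print(two_list_subset_sum([8, 6, 5, 3, 11], 20))   # 20 (6 + 3 + 11)
```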

