首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Thinning algorithms based on quadtree and octree representations   总被引:1,自引:0,他引:1  
Thinning is a critical pre-processing step to obtain skeletons for pattern analysis. Quadtree and octree are hierarchical data representations in image processing and computer graphics. In this paper, we present new 2-D area-based and 3-D surface-based thinning algorithms for directly converting quadtree and octree representations to skeletons. The computational complexity of our thinning algorithm for a 2-D or a 3-D image with each length N is respectively O(N2) or O(N3), which is more efficient than the existing algorithms of O(N3) or O(N4). Furthermore, our thinning algorithms can lessen boundary noise spurs and are suited for parallel implementation.  相似文献   

2.
We report an O(N) parallel tight binding molecular dynamics simulation study of (10×10) structured carbon nanotubes (CNT) at 300 K. We converted a sequential O(N3) TBMD simulation program into an O(N) parallel code, utilizing the concept of parallel virtual machines (PVM). The code is tested in a distributed memory system consisting of a cluster with 8 PC's that run under Linux (Slackware 2.2.13 kernel). Our results on the speed up, efficiency and system size are given.  相似文献   

3.
In this paper we give a very space-efficient, yet fast method for estimating the fractal dimensionality of the points in a data stream. Algorithms to estimate the fractal dimension exist, such as the straightforward quadratic algorithm and the faster O(NlogN) or even O(N) box-counting algorithms. However, the sub-quadratic algorithms require Ω(N) space. In this paper, we propose an algorithm that computes the fractal dimension in a single pass, using a constant amount of memory relative to data cardinality. Experimental results on synthetic and real world data sets demonstrate the effectiveness of our algorithm.  相似文献   

4.
Discrete Fourier transform (DFT) is an important tool in digital signal processing. In the present paper, we propose a novel approach to performing DFT. We transform DFT into a form expressed in discrete moments via a modular mapping and truncating Taylor series expansion. From this, we extend the use of our systolic array for fast computation of moments without any multiplications to one that computes DFT with only a few multiplications and without any evaluations of exponential functions. The multiplication number used in our method isO(Nlog2 N/ log2log2 N) superior toO(Nlog2 N) in FFT. The execution time of the systolic array is onlyO(Nlog2 N/ log2log2 N) for 1-D DFT andO(Nk) fork-D DFT (k⩾2). The systolic implementation is a demonstration of the locality of dataflow in the algorithms and hence it implies an easy and potential hardware/VLSI realization. The approach is also applicable to DFT inverses.  相似文献   

5.
Maximal outerplanar graphs constitute an important class of graphs, often encountered in various applications, e.g., computational geometry, robotics, etc. In this paper, we propose a parallel algorithm for testing the isomorphism of maximal outerplanar graphs. Given the ordered adjacency lists of the two graphs, the proposed algorithm tests their isomorphism inO(log N) time usingNprocessors, for graphs withNnodes on an EREW shared memory model, as well as on a hypercube arhitecture. When the adjacency matrices of the graphs are given, this algorithm can be redesigned onN2processors to run inO(log N) time.  相似文献   

6.
目的 基于马尔可夫随机场(MRF)的变分光流计算是一种较为鲁棒的光流计算方法,但是计算效率很低。置信传播算法(BP) 是一种针对MRF较为高效的全局优化算法。本文提出一种MRF变分光流计算模型并采用并行BP方法实现,极大提高计算效率。方法 提出的MRF变分光流计算模型中的数据项采用了Horn等人根据灰度守恒假设得到的光流基本约束方程,并采用非平方惩罚函数进行调整以平滑边界影响。为在CUDA平台上实现高效并行处理,本文提出了一种优化的基于置信传播的MRF并行光流计算方法。该优化方法在采用置信传播最小化MRF光流能量函数时,采用了一种4层的3维网络结构进行并行计算,每层对应MRF4邻域模型中的一个方向的信息传播,同时在每层中为每个像素分配多个线程采用并行降维法计算所要传递的信息,大大降低单线程计算负荷,大幅度提高计算效率。结果 采用旋转小球图像序列进行实验,计算效率提高314倍;采用旋转小球、Yosemite山谷和RubberWhale 3种不同图像序列,与Horn算法、Weickert算法、Hossen并行Lucas算法、Grauer-Gray并行MRF算法进行对比实验,本文方法得到最低的平均端点误差(AEE),分别为0.13、0.55和0.34。结论 本文提出了一种新的MRF光流计算模型,并在CUDA平台上实现了并行优化计算。实验结果表明,本文提出的并行计算方法在保持计算精度的同时极大提高了计算效率。本文方法对内存需求巨大,在处理高分辨率图像时,限制了采样点数,难以计算大位移。  相似文献   

7.
We introduce a GPU-based parallel vertex substitution (pVS) algorithm for the p-median problem using the CUDA architecture by NVIDIA. pVS is developed based on the best profit search algorithm, an implementation of vertex substitution (VS), that is shown to produce reliable solutions for p-median problems. In our approach, each candidate solution in the entire search space is allocated to a separate thread, rather than dividing the search space into parallel subsets. This strategy maximizes the usage of GPU parallel architecture and results in a significant speedup and robust solution quality. Computationally, pVS reduces the worst case complexity from sequential VS’s O(p · n2) to O(p · (n ? p)) on each thread by parallelizing computational tasks on GPU implementation. We tested the performance of pVS on two sets of numerous test cases (including 40 network instances from OR-lib) and compared the results against a CPU-based sequential VS implementation. Our results show that pVS achieved a speed gain ranging from 10 to 57 times over the traditional VS in all test network instances.  相似文献   

8.
Fractal objects which, by definition, are objects that have scale-invariant shapes and fractional scaling dimensions (fractal dimension) with magnitudes related to the complexity of the objects, are ubiquitous in nature. In particular, many biological structures and systems have fractal properties and therefore may be well studied and modelled using fractal geometry. In the box counting algorithm, one of several different approaches to calculate the fractal dimension, one determines, for several values of, the number of boxesN() with side length needed to completely cover the studied object. IfN() andr are found to be related by the power law relationshipN(r) r –D, whereD is the scaling dimension, and ifD is a non-integer, then the object is fractal andD is the fractal dimension.In certain circumstances in which one may need to calculate the fractal dimension of a three-dimensional fractal from a two-dimensional projection (e.g. an X-ray), a new mathematical relationship may be utilized to obtain the actual dimensionD; it readsD=–log[1–(1–R 2– D p) R ]/logR + 3, whereD p is the fractal dimension of the two-dimensional planar projection of the object andR is the box size used in calculating the box-counting dimension.Fractal dimension calculations have been found to be particularly useful as quantitative indices of the degree of coronary vascularity and the degree of heart interbeat interval variability. Fractal growth models such as diffusion limited aggregation (DLA) can be used to model artery growth.To sum up, fractal geometry is very useful in studying and modeling certain scale-invariant biological structures or systems which may not be easily described with Euclidean shapes.  相似文献   

9.
The solution of Schrödinger's equation leads to a high number N of independent variables. Furthermore, the restriction to (anti)symmetric functions implies some complications. We propose a sparse-grid approximation which leads to a set of non-orthogonal basis. Due to the antisymmetry, scalar products are expressed by sums of N×N-determinants. Because of the sparsity of the sparse-grid approximation, these determinants can be reduced from N×N to a much smaller size K×K. The sums over all permutations reduce to the quantities det K 1,…,α K ):=∑≤i 1,i 2,…,i K Ndet(a ,i β (αβ))α,β=1,…, K to be determined, where a i , j (αβ) are certain one-dimensional scalar products involving (sparse-grid) basis functions ?αβ. We propose a method to evaluate this expression such that the asymptotics of the computational cost with respect to N is O(N 3) for fixed K, while the storage requirements increase only with the factor N 2. Furthermore, we describe a parallel version (N processors) with full speed up.  相似文献   

10.
Topological design of intereonnection network is a key factor of developingparallel/distributed processing systems composed of a large number of microcomputermodules. For this purpose a double-chordal ring intereonnection network was proposed. Themost attractive of its advantages is that for an optimally designed network with N modules itsdiameter can he reduced to O(N~(1/3)) compared with O(N~(1/2)) for a simple chordal ring. Theessential properties of double-chordal ring network arc presented, and formulae for calculatingits diameter are derived. These formulae lead to a distributed computational routing algorithmand a way of optimization of the network parameters (maximal number of nedes and optimalchordal lengths) for a given diameter.  相似文献   

11.
The shear crack, propagating spontaneously on a frictional interface, is a useful idealization of a natural earthquake. However, the corresponding boundary value problems are quite challenging in terms of required memory and processor power. While the huge computation amount is reduced by the spectral boundary integral method, the computation effort is still huge. In this paper, a recursive method for the evaluation of convolution integrals was tested in the spectral formulation of the boundary integral method applied to 2D anti-plane crack propagation problems. It is shown that analysis of a 2D anti-plane crack propagation problem involving Nt time steps, based on the recursive evaluation of convolution integrals, requires O(αNt) computational resources for each Fourier mode (as opposed to O(Nt2) for a classical algorithm), where α is a constant depending on the implementation of the method with typical values much less than Nt. Therefore, this recursive scheme renders feasible investigation of long deformational processes involving large surfaces and long periods of time, while preserving accuracy. The computation methodology implemented here can be extended easily to 3D cases where it can be employed for the simulation of complex spontaneously fault rupture problems which carry a high computational cost.  相似文献   

12.
A novel reconfigurable network referred to as the Reconfigurable Multi-Ring Network (RMRN) is described. The RMRN is shown to be a truly scalable network, in that each node in the network has a fixed degree of connectivity and the reconfiguration mechanism ensures a network diameter of O(log2N) for an N-processor network. Algorithms for the 2-D mesh and the SIMD n-cube are shown to map very elegantly onto the RMRN. Basic message passing and reconfiguration primitives for the SIMD RMRN are designed which could be used as building blocks for more complex parallel algorithms. The RMRN is shown to be a viable architecture for image processing and computer vision problems via the parallel computation of the Hough transform. The parallel implementation of the Y-angle Hough transform of an N × N image is showed to have a asymptotic complexity of O(Y log2Y + log2N) on the SIMD RMRN with O(N2) processors. This compares favorably with the O(Y + log2N) optimal algorithm for the same Hough transform on the MIMD n-cube with O(N2) processors.  相似文献   

13.
We present a method called the Truncation method for computing Walsh-Hadamard transforms of one- and two-dimensional data. In one dimension, the method uses binary trees as a basis for representing the data and computing the transform. In two dimensions, the method uses quadtrees (pyramids), adaptive quad-trees, or binary trees as a basis. We analyze the storage and time complexity of this method in worst and general cases. The results show that the Truncation method degenerates to the Fast Walsh Transform (FWT) in the worst case, while the Truncation method is faster than the Fast Walsh Transform when there is coherence in the input data, as will typically be the case for image data. In one dimension, the performance of the Truncation method for N data samples is between O(N) and O(N log2N), and it is between O(N2) and O(N2 log2N) in two dimensions. Practical results on several images are presented to show that both the expected and actual overall times taken to compute Walsh transforms using the Truncation method are less than those required by a similar implementation of the FWT method.  相似文献   

14.
Clustering is a basic operation in image processing and computer vision, and it plays an important role in unsupervised pattern recognition and image segmentation. While there are many methods for clustering, the single-link hierarchical clustering is one of the most popular techniques. In this paper, with the advantages of both optical transmission and electronic computation, we design efficient parallel hierarchical clustering algorithms on the arrays with reconfigurable optical buses (AROB). We first design three efficient basic operations which include the matrix multiplication of two N×N matrices, finding the minimum spanning tree of a graph with N vertices, and identifying the connected component containing a specified vertex. Based on these three data operations, an O(log N) time parallel hierarchical clustering algorithm is proposed using N3 processors. Furthermore, if the connectivity of the AROB with four-port connection is allowed, two constant time clustering algorithms can be also derived using N4 and N3 processors, respectively. These results improve on previously known algorithms developed on various parallel computational models.  相似文献   

15.
A software library is presented for the polynomial expansion method (PEM) of the density of states (DOS) introduced in [Y. Motome, N. Furukawa, J. Phys. Soc. Japan 68 (1999) 3853; N. Furukawa, Y. Motome, H. Nakata, Comput. Phys. Comm. 142 (2001) 410]. The library provides all necessary functions for the use of the PEM and its truncated version (TPEM) in a model independent way. The PEM/TPEM replaces the exact diagonalization of the one electron sector in models for fermions coupled to classical fields. The computational cost of the algorithm is O(N)—with N the number of lattice sites—for the TPEM [N. Furukawa, Y. Motome, J. Phys. Soc. Japan 73 (2004) 1482] which should be contrasted with the computational cost of the diagonalization technique that scales as O(N4). The method is applied for the first time to a double exchange model with finite Hund coupling and also to diluted spin-fermion models.

Program summary

Title of library:TPEMCatalogue identifier: ADVKProgram summary URL:http://cpc.cs.qub.ac.uk/summaries/ADVKProgram obtainable from: CPC Program Library, Queen's University of Belfast, N. IrelandNo. of lines in distributed program, including test data, etc.: 1707No. of bytes in distributed program, including test data, etc.: 13 644Distribution format:tar.gzOperating system:Linux, UNIXNumber of files:4 plus 1 test programProgramming language used:CComputer:PCNature of the physical problem:The study of correlated electrons coupled to classical fields appears in the treatment of many materials of much current interest in condensed matter theory, e.g., manganites, diluted magnetic semiconductors and high temperature superconductors among others.Method of solution: Typically an exact diagonalization of the electronic sector is performed in this type of models for each configuration of classical fields, which are integrated using a classical Monte Carlo algorithm. A polynomial expansion of the density of states is able to replace the exact diagonalization, decreasing the computational complexity of the problem from O(N4) to O(N) and allowing for the study of larger lattices and more complex and realistic systems.  相似文献   

16.
To better understand the characteristics of seepage-pores (pore radius larger than 100 nanometers) and their influence on the permeability of coals, we have conducted fractal analyses for 34 fresh coal samples (mean maximum vitrinite reflectance Ro,max from 0.43% to 4.21%) from North, Northwest and Northeast China. Mercury porosimetry data indicate that the coals are fractal, with pore radius ranging from 0.1 to 50 μm. Calculated fractal dimensions of these coals range from 2.61 to 2.98, higher than those from other kinds of rocks such as sandstone, shale, and carbonate. The data suggest that the coals have more complicated and inhomogeneous pore structures than other rocks.The fractal dimension has a negative correlation with the petrologic permeability of coals, particularly for higher rank coals (with 1.47–4.21% Ro,max), from which a strong negative linear correlation (R2=0.85) between fractal dimension and permeability is observed. A ‘U-shaped’ trend between fractal dimensions and coal ranks is observed, with the minimum fractal dimensions occurring at 1.1–1.3% Ro,max. The sub-bituminous, high volatile bituminous, semi-anthracite, and anthracite have higher fractal dimensions. The effects of coal rank upon fractal dimensions are mainly due to the variety of micropore contents and aromaticity of coals with coalification.  相似文献   

17.
We present a simple parallel algorithm for the single-source shortest path problem in planar digraphs with nonnegative real edge weights. The algorithm runs on the EREW PRAM model of parallel computation in O((n2ε+n1−ε) log n) time, performing O(n1+ε log n) work for any 0<ε<1/2. The strength of the algorithm is its simplicity, making it easy to implement and presumable quite efficient in practice. The algorithm improves upon the work of all previous parallel algorithms. Our algorithm is based on a region decomposition of the input graph and uses a well-known parallel implementation of Dijkstra's algorithm. The logarithmic factor in both the work and the time can be eliminated by plugging in a less practical, sequential planar shortest path algorithm together with an improved parallel implementation of Dijkstra's algorithm.  相似文献   

18.
The temporal evolution of a class of one-dimensional cellular automata (CA) implementable in VLSI, is a problem of considerable complexity in the presence of the natural (null) boundary conditions. This paper provides an analytical solution to their evolution, based upon a method of image charges borrowed from electrostatics. Reductions in computational effort by factors of O(L2) for a single site value, or O(L) for the entire configuration of the CA, as compared to direct simulation, are obtainable by the present method of image charges. These results are expected to provide a basis for CA applications as highly parallel computational structures to be incorporated as functional blocks is novel VLSI architectures.  相似文献   

19.
《Parallel Computing》1997,23(13):2041-2065
A parallel diagonally scaled dynamic alternating-direction-implicit (DSDADI) method is shown to be an effective algorithm for solving the 2D and 3D steady-state diffusion equation on large uniform Cartesian grids. Empirical evidence from the parallel solution of large gridsize problems suggests that the computational work done by DSDADI to converge over an Nd grid with continuous diffusivity is of lower order than O(Nd+α) for any fixed α > 0. This is in contrast to the method of diagonally scaled conjugate gradients (DSCG), for which the computational work necessary for convergence is O(Nd+1). Furthermore, the combination of diagonal scaling, spatial domain decomposition (SDD), and distributed tridiagonal system solution gives the DSDADI algorithm reasonable scalability on distributed-memory multiprocessors such as the CRAY T3D. Finally, an approximate parallel tridiagonal system solver with diminished interprocessor communication exhibits additional utility for DSDADI.  相似文献   

20.
We have developed a parallel algorithm for radial basis function (rbf) interpolation that exhibits O(N) complexity, requires O(N) storage, and scales excellently up to a thousand processes. The algorithm uses a gmres iterative solver with a restricted additive Schwarz method (rasm) as a preconditioner and a fast matrix-vector algorithm. Previous fast rbf methods — achieving at most O(NlogN) complexity — were developed using multiquadric and polyharmonic basis functions. In contrast, the present method uses Gaussians with a small variance with respect to the domain, but with sufficient overlap. This is a common choice in particle methods for fluid simulation, our main target application. The fast decay of the Gaussian basis function allows rapid convergence of the iterative solver even when the subdomains in the rasm are very small. At the same time we show that the accuracy of the interpolation can achieve machine precision. The present method was implemented in parallel using the petsc library (developer version). Numerical experiments demonstrate its capability in problems of rbf interpolation with more than 50 million data points, timing at 106 s (19 iterations for an error tolerance of 10? 15) on 1024 processors of a Blue Gene/L (700 MHz PowerPC processors). The parallel code is freely available in the open-source model.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号