Similar literature
20 similar documents found.
1.
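Application of tree structures to the N-body problem*   Total citations: 1, self-citations: 0, citations by others: 1
Numerical simulation of the N-body problem requires computing the interaction between every pair of particles at each time step, at a cost of O(N^2). Tree-structured codes not only reduce storage overhead but also lend themselves to fast computation and parallel partitioning. The Barnes-Hut algorithm (BHA) and the fast multipole method (FMM) are both fast algorithms built on tree structures. BHA quickly computes the field force on each point with O(N log N) complexity, but its accuracy is typically only about 1%; FMM computes the potential at each point through hierarchical partitioning and a multipole expansion of the potential function, with O(N) complexity, yet can reach arbitrary accuracy. Numerical results show that the tree structure also parallelizes well.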
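For reference, the O(N^2) baseline that tree codes are designed to avoid can be sketched as follows. This is a minimal illustration, not code from the paper: it assumes 2-D gravitational point masses, and the function name `direct_forces`, the constant `G`, and the `softening` term are hypothetical choices made for this example.

```python
import math

def direct_forces(positions, masses, G=1.0, softening=1e-9):
    """Direct-summation forces in 2-D: every pair is visited once, O(N^2)."""
    n = len(positions)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            # Softening (an assumption of this sketch) avoids division by zero.
            r2 = dx * dx + dy * dy + softening
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            f = G * masses[i] * masses[j] * inv_r3
            # Newton's third law: equal and opposite contributions.
            forces[i][0] += f * dx
            forces[i][1] += f * dy
            forces[j][0] -= f * dx
            forces[j][1] -= f * dy
    return forces
```

A Barnes-Hut code would replace the inner loop over distant particles with a single interaction against the center of mass of a tree cell, which is where the O(N log N) cost comes from.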

2.
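Discretizing the electric field integral equation (EFIE) with the method of moments (MOM) yields a dense linear system, which can be solved iteratively (for example with the TFQMR method used in this paper). In each iteration, the matrix-vector product costs O(N^2). The multilevel fast multipole method (MLFMM) reduces this cost to O(N log N). A fast Fourier transform based on the spherical harmonic transform further accelerates the inter-level interpolation in MLFMM. Numerical results show that solving the EFIE with MLFMM is feasible.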

3.
A unified vector sorting algorithm (VSA) is proposed, which sorts N arbitrary numbers of c log2 N bits each on an SIMD multi-processor system (SMMP) with p = N^(1+ε)/u processors and a composite interconnection network in T = (c/ε)(4 log2 N - 2 log2 u + 10u) time, where c is an arbitrary positive constant. When ε is an arbitrarily small positive constant and u = log2 N, it is an O(log N) algorithm with p = N^(1+ε)/log2 N; when ε = 1/log N and u = 2 log2 N, it is an optimal algorithm (p = N/log2 N, T = O(log^2 N), pT = O(N log N)); where u=1, c=1 and ε=0.5 (a constant).

4.
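Fast Hough transform line detection based on a hierarchical scheme   Total citations: 2, self-citations: 0, citations by others: 2
易玲 《微计算机信息》2007,23(31):206-208
After analyzing line detection with the standard Hough transform, the two-point voting Hough transform, and the multi-level Hough transform, this work addresses their shortcomings and combines their strengths to design and implement a hierarchical fast Hough transform method for line detection. The algorithm is described and analyzed in detail, and experiments confirm its effectiveness: the proposed line detector is fast, accurate, and robust, and has practical application value.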

5.
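陈宏建  陈崚  秦玲  徐晓华  屠莉 《计算机工程》2004,30(24):17-18,191
Y. Pan proposed a parallel sorting algorithm on the array model with pipelined optical buses (LARPBS) that sorts N elements with N processors in O(log N) time in the best case and O(N) time in the worst case. Building on it, this paper proposes a scalable fast parallel sorting algorithm on the LARPBS model that sorts N elements with p (1 ≤ p ≤ N) processors in O(N log N/p) time in the best case and O(N^2/p) time in the worst case. It also proposes an improved fast and efficient parallel sorting algorithm on the LARPBS model that sorts N elements with N processors in O(log√N) time in the best case and O(√N) time in the worst case.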

6.
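A fast Hough transform algorithm   Total citations: 8, self-citations: 0, citations by others: 8
The Hough transform is a widely used detection algorithm in image processing that can effectively extract specific information from images in fairly noisy environments. The standard Hough transform, however, is computationally expensive and slow, which limits its use considerably. This paper discusses a fast Hough transform algorithm that lowers the time complexity of the traditional Hough transform and improves computational efficiency and speed, which contributes significantly to faster and more real-time image processing.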
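To make the cost of the standard algorithm concrete, here is a minimal sketch of standard Hough voting for straight lines. It is not the fast algorithm discussed in the paper; the function name `hough_lines` and the parameters `n_theta` and `rho_step` are hypothetical and chosen only for illustration. Each edge point votes once per angle bin, which is why the standard method is expensive.

```python
import math

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Standard Hough voting with rho = x*cos(theta) + y*sin(theta).

    Returns the accumulator (a dict keyed by (theta_bin, rho_bin))
    and the key of the cell with the most votes.
    """
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            key = (t, round(rho / rho_step))
            acc[key] = acc.get(key, 0) + 1
    best = max(acc, key=acc.get)
    return acc, best
```

For example, ten points on the horizontal line y = 5 (x = 0, 10, ..., 90) all vote for the cell with a 90-degree angle bin and rho bin 5, so that cell collects ten votes.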

7.
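张满  陶亮 《微机发展》2012,(10):133-135
The discrete Hartley transform (DHT) is a useful real-valued orthogonal transform. This paper studies fast algorithms for it. It first describes how computing the discrete Fourier transform (DFT) with the arithmetic Fourier transform (AFT) requires only O(N) multiplications. Based on this property, it analyzes the structure of the DHT, links the AFT and the DHT directly through the DFT, and proposes a new fast DHT algorithm. The algorithm reaches linear O(N) computational complexity, has a simple structure, and uses uniform formulas that are easy to implement. A comparison with other fast algorithms shows that when the data length is not a power of 2, the proposed algorithm takes noticeably less time. Experimental results also confirm its effectiveness, opening a new approach to fast DHT computation.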
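The link between the DHT and the DFT that the abstract relies on is the standard identity H[k] = Re(X[k]) - Im(X[k]), where X is the DFT of a real sequence. The sketch below only checks that identity with direct O(N^2) sums; it does not implement the AFT-based algorithm of the paper, and the function names `dht_direct` and `dht_via_dft` are hypothetical.

```python
import cmath
import math

def dht_direct(x):
    """O(N^2) DHT: H[k] = sum_m x[m] * (cos(a) + sin(a)), a = 2*pi*m*k/N."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for m in range(n):
            a = 2 * math.pi * m * k / n
            acc += x[m] * (math.cos(a) + math.sin(a))
        out.append(acc)
    return out

def dht_via_dft(x):
    """Same transform obtained from the DFT: H[k] = Re(X[k]) - Im(X[k])."""
    n = len(x)
    out = []
    for k in range(n):
        X = sum(x[m] * cmath.exp(-2j * math.pi * m * k / n) for m in range(n))
        out.append(X.real - X.imag)
    return out
```

Both functions accept any length, including lengths that are not powers of 2. Applying `dht_direct` twice returns the input scaled by N, which reflects the fact that the DHT is its own inverse up to that factor.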

8.
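Lane line detection is one of the core algorithms in intelligent driver assistance. To address the low efficiency of lane detection based on the traditional Hough transform, a fast lane detection algorithm based on the cascaded Hough transform is proposed. The algorithm first preprocesses each video frame with ROI selection, filtering, edge detection, and non-maximum suppression. It then uses a mapping based on parallel coordinates to transform the original image into parameter space, mapping points to lines and lines to points. Applying the mapping a second time yields a point-to-point, line-to-line mapping, from which the lane vanishing point is extracted quickly; the actual lane lines are then scanned from the vanishing point position to complete lane extraction. Throughout the point mapping, coordinate values undergo only linear transformations, which avoids the per-point polar coordinate conversion required by the traditional Hough transform, so the computation is simpler and more efficient. Simulation experiments show that, compared with the traditional Hough transform, the improved algorithm is 31% faster and 6.2% more accurate, with clearly better detection results, and can be widely applied in intelligent driver assistance.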

9.
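Using the displacement structure of the Sylvester matrix of order m+n, and assuming that all leading principal submatrices of the matrix are invertible, this paper gives a fast algorithm for inverting the Sylvester matrix. The algorithm requires O((m+n)^2) operations, whereas Gauss-Jordan elimination requires O((m+n)^3). Numerical examples illustrate the effectiveness of the algorithm.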

10.
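Using repeated quantum Fourier transforms and a change of variables, this paper gives a quantum algorithm for the discrete logarithm over Z_N and characterizes the relationship between the order r of the element and the success probability of the algorithm. When r is prime, the success probability is close to 1. The new algorithm requires O(L^3) elementary quantum gates, where L = [log N] + 1, and does not need to perform the inverse of the quantum Fourier transform of the function |f(x1,x2)〉, which makes it better than existing quantum algorithms for the discrete logarithm over Z_N.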

11.
The Hough transform is an important problem in image processing and computer vision. An efficient algorithm for computing the Hough transform has been proposed on a reconfigurable array by Kao et al. (1995). For a problem with a √N×√N image and an n×n parameter space, the algorithm runs in constant time on a three-dimensional (3-D) n×n×N reconfigurable mesh where the data bus is N^(1/c)-bit wide. To the best of our knowledge, this is the most efficient constant-time algorithm for computing the Hough transform on a reconfigurable mesh. In this paper, an improved Hough transform algorithm on a reconfigurable mesh is proposed. For the same problem, our algorithm runs in constant time on a 3-D n*n×n×√n√n reconfigurable mesh, where the data bus is only log N-bit wide. In most practical situations, n = O(√N). Hence, our algorithm requires much less VLSI area to accomplish the same task. In addition, our algorithm can compute the Radon transform (a generalized Hough transform) in O(1) time on the same model, whereas the algorithm in the above paper cannot easily be adapted to computing the Radon transform.

12.
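Fast discrete Fourier transform on parallel dodecahedral domains and its parallel implementation   Total citations: 3, self-citations: 0, citations by others: 3
§1. Introduction. The fast Fourier transform is widely used in signal processing, multimedia compression, pattern recognition, computational chemistry, and many other fields, and it is recognized as one of the ten most important algorithms of the twentieth century. One of the most influential results in high-performance computing in 2002 was the successful computation by Mitsuo Yokokawa et al. of a turbulence problem on a 2048 × 2048 × 2048 grid using a three-dimensional FFT on the Earth Simulator. Existing fast Fourier methods, however, mostly implement the high-dimensional Fourier transform (HFT) by using tensor products to reduce the high-dimensional problem to low-dimensional ones, and the domains they can handle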

13.
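By analyzing the non-uniformity of the distance and angle between nearest-neighbor lines in an image, the correlation between these two parameters and the direction and length of line segments is obtained. On this basis, NUHT (Nonuniform Hough Transform), a line detection algorithm that quantizes the Hough space non-uniformly, is proposed. Experimental results show that, without reducing computational efficiency, NUHT has twice the line-segment detection capability of the standard Hough transform (SHT), while its false-alarm rate is less than half that of SHT.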

14.
We present an O((log log N)^2)-time algorithm for computing the distance transform of an N × N binary image. Our algorithm is designed for the common concurrent read concurrent write parallel random access machine (CRCW PRAM) and requires O(N^(2+ε)/log log N) processors, for any ε such that 0 < ε < 1. Our algorithm is based on a novel deterministic sampling scheme and can be used for computing distance transforms for a very general class of distance functions. We also present a scalable version of our algorithm for the case where the number of available processors is p^(2+ε)/log log p for some p < N. In this case, our algorithm runs in O((N^2/p^2) + (N/p) log log p + (log log p)^2) time. This scalable algorithm is more practical since usually the number of available processors is much less than the size of the image.

15.
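Chip detection based on the generalized Hough transform   Total citations: 1, self-citations: 1, citations by others: 0
张小军  胡福乔 《计算机工程》2009,35(23):252-254
The traditional generalized Hough transform has high space and time complexity and is unsuitable for real-time applications. To address this problem, a chip detection algorithm based on the generalized Hough transform is proposed that reduces the computational complexity. Its main idea is to combine multi-scale analysis with the generalized Hough transform. Applied to chip detection in an automatic optical inspection system, the algorithm achieves good detection results.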

16.
The performance of a multiprocessor system depends heavily on its ability to provide conflict-free paths among its processors. In this paper, we explore the possibility of using a nonblocking network with O(N log N) edges (crosspoints) to interconnect the processors of an N-processor system. We combine Bassalygo and Pinsker's implicit design of strictly nonblocking networks with an explicit construction of expanders to obtain a strictly nonblocking network with -765.18N+352.8N log N edges and 2+log(N/5) depth. We present an efficient parallel algorithm for routing connection requests on this network and implement it on three parallel processor topologies. The implementation on a parallel processor whose processing elements are interconnected as in the Bassalygo-Pinsker network requires O(N log N) processing elements and O(N log N) interprocessor links, and it takes O(log N) steps to route any single connection request, where each step involves a small number (≈72) of bit-level operations. A contracted or folded version of the same implementation reduces the processing element count to O(N) without increasing the link count or the routing time. Finally, we establish that the same algorithm takes O(log^3 N) steps on a perfect shuffle processor with O(N) processing elements. These results improve the crosspoint, depth and routing time complexities of the previously reported strictly nonblocking networks.

17.
The computation model on which the algorithms are developed is the reconfigurable array of processors with wider bus networks (abbreviated to RAPWBN). The main difference between the RAPWBN model and other existing reconfigurable parallel processing systems is that the bus width of each network is bounded within the range [2, [√N]]. Such a strategy not only saves the silicon area of the chip and increases the computational power enormously, but also allows the execution speed of the proposed algorithms to be tuned by the bus bandwidth. To demonstrate the computational power of the RAPWBN, the channel-assignment problem is derived in this paper. For the channel-assignment problem with N pairs of components, we first design an O(T + [N/ω]) time parallel algorithm using 2N processors with a 2N-row by 2N-column bus network, where the bus width of each bus network is ω-bit for 2 ≤ ω ≤ [√N] and T = [log_ω N] + 1. By tuning the bus bandwidth to the natural log N-bit and the extended N^(1/c)-bit (N^(1/c) > log N) for any constant c with c ≥ 1, two more results which run in O(log N/log log N) and O(1) time, respectively, are also derived. When compared to the algorithms proposed by Olariu et al. [17] and Lin [14], it is shown that our algorithm runs in the equivalent time complexity while significantly reducing the number of processors to O(N).

18.
In spite of their good filtering characteristics for vector-valued image processing, the usability of vector median filters is limited by their high computational complexity. Given an N × N image and a W × W window, the computational complexity of the vector median filter is O(W^4 N^2). In this paper, we design three fast and efficient parallel algorithms for vector median filtering based on the 2-norm (L2) on the arrays with reconfigurable optical buses (AROB). For 1 ⩽ p ⩽ W ⩽ q ⩽ N, our algorithms run in O(W^4 log W/p^4), O(W^2 N^2/p^4 q^2 log W) and O(1) times using p^4 N^2/log W, p^4 q^2/log W, and W^4 N^2 log N processors, respectively. In the sense of the product of time and the number of processors used, the first two results are cost optimal and the last one is time optimal.
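The O(W^4 N^2) sequential cost quoted above can be seen in a minimal sketch of the basic L2 vector median filter. This is the textbook sequential definition, not any of the parallel AROB algorithms of the paper; the function names and the border handling (clamping indices to the image edge) are assumptions of this example.

```python
import math

def vector_median(window):
    """Return the vector in `window` with the smallest summed L2 distance
    to all vectors in the window (W^2 candidates, W^2 distances each)."""
    def total_dist(v):
        return sum(math.dist(v, w) for w in window)
    return min(window, key=total_dist)

def vector_median_filter(image, w):
    """Apply a w x w vector median filter to a 2-D grid of vectors.

    Borders are handled by clamping indices (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    r = w // 2
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                image[min(max(i + di, 0), rows - 1)][min(max(j + dj, 0), cols - 1)]
                for di in range(-r, r + 1)
                for dj in range(-r, r + 1)
            ]
            out[i][j] = vector_median(window)
    return out
```

Each of the N^2 pixels evaluates W^2 candidates against W^2 neighbors, which gives the W^4 N^2 operation count. On a 3 × 3 color image that is uniform except for one outlier pixel, the filter replaces the outlier with the surrounding color.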

19.
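The fast Fourier transform (FFT) is widely used in science and engineering. Running parallel FFT computations in a grid environment can increase computation speed and broaden the use of the FFT. After reviewing the state of grid computing, this paper describes grid-based distributed parallel computing in detail. Taking the FFT algorithm as the test case, the experiments implement parallel FFT computation on the Globus Toolkit 4 platform and analyze the experimental data, demonstrating the feasibility of grid-based parallel FFT computation. Finally, the paper points out the importance of grid resource scheduling for parallel computing.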

20.
This paper defines the difference of low-pass (DOLP) transform and describes a fast algorithm for its computation. The DOLP is a reversible transform which converts an image into a set of bandpass images. A DOLP transform is shown to require O(N^2) multiplies and produce O(N log(N)) samples from an N sample image. When Gaussian low-pass filters are used, the result is a set of images which have been convolved with difference of Gaussian (DOG) filters from an exponential set of sizes. A fast computation technique based on "resampling" is described and shown to reduce the DOLP transform complexity to O(N log(N)) multiplies and O(N) storage locations. A second technique, "cascaded convolution with expansion," is then defined and also shown to reduce the computational cost to O(N log(N)) multiplies. Combining these two techniques yields an algorithm for a DOLP transform that requires O(N) storage cells and O(N) multiplies.
