Similar literature
 Found 20 similar documents; search took 31 ms
1.
There are two ways, other than the standard fast Fourier transform (FFT) algorithm, of computing Fourier transforms of real data: (1) the real fast Fourier transform (RFFT) algorithm, and (2) the fast Hartley transform (FHT) algorithm. On a sequential computer, it has been shown that both the RFFT and the FHT algorithms are faster than the FFT algorithm. However, it is not obvious that the same is true on a parallel machine. The communication requirements of the RFFT and the FHT algorithms, which are critical to the cost of any parallel implementation, are different from those of the FFT algorithm. In this paper we present efficient implementations of the RFFT and the FHT algorithms on a hypercube machine. Experimental results are given for the implementation of the RFFT and the FHT algorithms on the NCUBE machine.
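
The relationship between the Hartley and Fourier transforms of real data that this abstract relies on can be illustrated with a minimal sketch. It uses naive O(N^2) transforms rather than the fast or parallel algorithms of the paper, and the function names (`dft`, `dht`, `dft_from_dht`) are hypothetical, chosen only for this example.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform, used only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dht(x):
    """Naive O(N^2) discrete Hartley transform with the cas = cos + sin kernel."""
    n = len(x)
    return [sum(x[t] * (math.cos(2 * math.pi * k * t / n)
                        + math.sin(2 * math.pi * k * t / n))
                for t in range(n))
            for k in range(n)]


def dft_from_dht(h):
    """Recover the DFT of a real sequence from its Hartley transform.

    For real input, H[k] = Re F[k] - Im F[k] and H[N-k] = Re F[k] + Im F[k],
    so the real and imaginary parts follow from one addition and one subtraction.
    """
    n = len(h)
    return [complex((h[k] + h[-k % n]) / 2, (h[-k % n] - h[k]) / 2)
            for k in range(n)]
```

Because the Hartley transform of real data is itself real, an FHT works on real numbers throughout, and the spectrum is recovered by the cheap post-processing step in `dft_from_dht`.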

2.
An application-specific architecture for the parallel calculation of the decimation-in-time, radix-2 fast Hartley (FHT) and Fourier (FFT) transforms is presented. A real sequence with N = 2^n data items is considered as input. The system calculates the FHT and the FFT in n and n+1 stages, respectively. The modular and regular parallel architecture is based on a constant geometry algorithm using butterflies of four data items and the perfect unshuffle permutation. With this permutation, the mapping of the algorithm in VLSI technology is simplified and the communications among processors are minimized. Organization of the processor memory based on first-in, first-out (FIFO) queues facilitates a systolic data flow and permits a direct implementation of the complex data movements and address sequences of the transforms. This is accomplished by means of simple multiplexing operations, using hardwired control. The total calculation time is (N log_2 N)/(4Q) cycles for the FHT and N(1 + log_2 N)/(4Q) cycles for the FFT, where Q is the number of processors (Q = 2^q, Q ≤ N/4).

3.
Le-Ngoc T., Vo M.T. 《Micro, IEEE》1989,9(5):20-27
The authors investigated the implementation aspects of the fast Hartley transform (FHT) in both software and hardware. They describe the modifications required to convert existing fast Fourier transform (FFT) programs to execute FHTs, showing the ease with which these modifications can be implemented. They compare the execution time and memory storage requirements of both transforms and present power spectrum calculation and convolution as illustrative examples to compare the performance of the two transform techniques. They also give a comparative survey of the performance of various microprocessors and digital signal processors in FFT and FHT computation.

4.
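Fast Hough Transform Algorithm   Total citations: 37, self-citations: 0, citations by others: 37
Sun Fengrong, Liu Jiren 《计算机学报》 (Chinese Journal of Computers) 2001,24(10):1102-1109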
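In line detection for binary images, the standard Hough transform algorithm has a computational cost of O(N^3). This paper proposes a fast Hough transform algorithm whose cost is only O(N^2 log_2 N). The fast algorithm can be implemented in parallel: with a processor array of size O(N^2), the cost is O(log_2 N). The paper also derives an upper bound on the error of the fast algorithm and proposes an improved fast Hough transform algorithm that achieves higher accuracy. Finally, numerical examples are given. Both the theoretical analysis and the numerical examples show that the proposed fast Hough transform algorithm is more computationally efficient for line detection and has good accuracy.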
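
For orientation, the standard Hough voting scheme that this abstract uses as its O(N^3) baseline can be sketched as follows. This is not the paper's fast algorithm, which is not reproduced here. The function name `hough_votes`, the rounding of rho to integers, and the uniform angle sampling are assumptions made for the example.

```python
import math


def hough_votes(points, n_theta):
    """Standard Hough voting for lines rho = x*cos(theta) + y*sin(theta).

    points  : iterable of (x, y) coordinates of foreground pixels
    n_theta : number of uniformly sampled angles in [0, pi)
    Returns a dict mapping (theta_index, rounded_rho) to a vote count.
    """
    votes = {}
    for x, y in points:
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(i, rho)] = votes.get((i, rho), 0) + 1
    return votes
```

With on the order of N^2 foreground pixels and N sampled angles, the double loop performs on the order of N^3 updates, which is the cost the abstract quotes for the standard algorithm.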

5.
6.
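The fast Hadamard transform is widely used in signal and image processing, communication systems, digital logic, and other fields. When the problem size is very large, the fast Hadamard transform may fail to meet computation-time requirements; in that case, parallelizing the algorithm is an effective remedy. Taking compressed-sensing image recovery for a single-pixel camera as the application background, and exploiting the structural similarity between the radix-2 fast Hadamard transform and the fast Fourier transform, this paper proposes a general task-level parallel algorithm for the radix-2 fast Hadamard transform and proves constructively that the parallel and serial algorithms produce equivalent results. Simulations show that for vector lengths below 2^20 and fewer than 2^10 parallel subtasks, the squared Euclidean distance between the parallel and serial numerical results is less than 10^-18, which supports the correctness of the parallel algorithm. Experiments with a POSIX-threads implementation on a multicore CPU on a PC platform show that, on that specific platform and configuration, the parallel speedup is 1.33 to 1.42 for vector lengths from 2^20 to 2^25, demonstrating the feasibility and effectiveness of the proposed method.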
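
As a point of reference for the serial algorithm mentioned above, a radix-2 fast Hadamard transform can be sketched in a few lines. This is a minimal serial sketch in natural (Hadamard) ordering without normalization, not the paper's task-level parallel algorithm; the name `fwht` is hypothetical.

```python
def fwht(values):
    """Unnormalized radix-2 fast Walsh-Hadamard transform, natural ordering.

    The input length must be a power of two. Returns a new list.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # One stage: butterflies pair elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
        h *= 2
    return a
```

The butterfly structure is the same as in a radix-2 FFT with all twiddle factors equal to 1. Within a single stage the butterflies touch disjoint pairs of elements, so they are independent of one another, which is the property a parallel version can exploit.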

7.
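Based on the characteristics of network worm attacks, a failed connection traffic (FCT) time series that reflects worm scanning behavior is constructed, and a new worm detection method is proposed based on wavelet packet energy features of the FCT time series and a support vector machine (SVM). The method uses wavelet packet analysis to compute the energy distribution of the FCT time series projected onto each frequency band, obtaining a feature vector that characterizes worm scanning; an SVM classifier trained on samples then performs the classification, enabling automatic detection of worm attack scans. Experimental results show that the method detects worm attacks fairly accurately: compared with the theoretical values, the false negative rate is below 6% and the false positive rate is below 1%.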

8.
The self-sorting (SS) algorithm is a highly efficient version of the fast Fourier transform (FFT) because, unlike the generally used algorithms, it does not require shuffling the sequence to be transformed (digit reversal). In this work, we propose a parallel architecture that implements the SS radix-r (r ≥ 2) algorithm. The data flow of the algorithm is decomposed, in a natural way, into two sections that are implemented by means of FIFO queues located in the processors and an interprocessor connection network (perfect unshuffle). The resulting design is regular and modular and, whenever possible, presents constant geometry. The total processing time required is nN/(rQ) cycles for a transform of size N = r^n computed using Q = r^q processors. Consequently, there are no cycle losses.
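
One well-known self-sorting formulation is the Stockham autosort FFT, sketched below for radix 2. It is offered only as an illustration of how a transform can avoid the digit-reversal pass by alternating between two buffers; the abstract does not say that the paper uses this exact variant, and the name `stockham_fft` is hypothetical.

```python
import cmath


def stockham_fft(values):
    """Radix-2 Stockham autosort FFT: natural-order input and output.

    No bit-reversal permutation is needed; each stage reads from one buffer
    and writes to the other with a different stride.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    x = [complex(v) for v in values]
    y = [0j] * n
    length, stride = n, 1
    while length > 1:
        half = length // 2
        for p in range(half):
            w = cmath.exp(-2j * cmath.pi * p / length)
            for q in range(stride):
                a = x[q + stride * p]
                b = x[q + stride * (p + half)]
                y[q + stride * (2 * p)] = a + b
                y[q + stride * (2 * p + 1)] = (a - b) * w
        x, y = y, x  # the freshly written buffer becomes the next input
        length, stride = half, stride * 2
    return x
```

The price of skipping the reordering pass is the second buffer of size N.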

9.
Algorithms are considered for the parallel computation of multidimensional discrete orthogonal transforms. They rely on a previously developed approach to parallelizing discrete Fourier transforms and on methods of reducing various discrete orthogonal transforms (discrete Hartley transform, discrete cosine transform, etc.) to discrete Fourier transforms of a special form.

10.
Residual diffusion in fluid-dynamics calculations results from the finite order of approximation in the underlying linear algorithm, including the effect of smoothing sometimes added for numerical reasons, and, in the case of monotonicity-preserving algorithms such as flux-corrected transport (FCT), the nonlinear action of the flux limiter on steep profiles. Some widely used FCT algorithms contain a multiplicative constant that reduces the antidiffusion coefficient by 0.01%–0.1%. Replacing this constant with a smoothly varying function of velocity which equals unity when the Courant number vanishes causes the linear diffusion to go to zero when the flow velocity does. The use of a velocity-dependent antidiffusion coefficient minimizes numerical smearing of discontinuities and associated effects in the neighboring flow. Computational examples are presented. The residual diffusion for nonzero flow speeds is nonlinear and problem dependent. A method is presented for calibrating it in any given code in the context of a particular problem, and is applied to the FCT algorithms described here.

11.
《Graphical Models》2012,74(4):221-232
This paper systematically studies the well-known Mexican hat wavelet (MHW) on manifold geometry, including its derivation, properties, transforms, and applications. The MHW is rigorously derived from the heat kernel by taking the negative first-order derivative with respect to time. As a solution to the heat equation, it has a clear initial condition: the Laplace–Beltrami operator. Following a popular methodology in mathematics, we analyze the MHW and its transforms from a Fourier perspective. By formulating Fourier transforms of bivariate kernels and convolutions, we obtain its explicit expression in the Fourier domain, which is a scaled differential operator continuously dilated via heat diffusion. The MHW is localized in both space and frequency, which enables space-frequency analysis of input functions. We define its continuous and discrete transforms as convolutions of bivariate kernels, and propose a fast method to compute convolutions by Fourier transform. To broaden its application scope, we apply the MHW to graphics problems of feature detection and geometry processing.

12.
This paper proposes an optimized design of a Discrete Hilbert Transform (DHT) processor using the Complex Binary Number System (CBNS). The conventional implementation of the DHT based on the "divide and conquer" approach involves two separate computational units for the real and imaginary parts, which requires a large silicon area and increases the path delay. In contrast, incorporating CBNS in transformation techniques facilitates complex-valued signal processing through a single computational unit. The CBNS-DHT processor has been designed using the standard computational method of the Fast Fourier Transform (FFT). A 2-D systolic array architecture with a novel processing element is proposed for CBNS-based complex-valued FFT (CFFT) and inverse FFT (CIFFT) computations. The CBNS-CFFT/CIFFT architecture has been extended to develop the CBNS-DHT processor on the Zynq-7000 family, XC7Z020-CLG484 FPGA platform. A comparative performance analysis of CBNS-DHT and Normal Binary Number System (NBNS)-DHT highlights the efficiency of CBNS-DHT in terms of VLSI parameters: silicon area, path delay, and memory utilization. CBNS-CFFT shows significant improvement in path delay and area consumption compared to NBNS-CFFT for both twiddle factors and FFT size, which indicates that CBNS-based CFFT and DHT processor design is more efficient in terms of speed and area requirements.

13.
In this article, parallel computation of manipulator inverse dynamics is investigated. A hierarchical graph-based mapping approach is devised to analyze the inherent parallelism in the Newton-Euler formulation at several computational levels, and to derive the features of an abstract architecture for exploitation of parallelism. At each level, a parallel algorithm represents the application of a parallel model of computation that transforms the computation into a graph whose structure defines the features of an abstract architecture, i.e., number of processors, communication structure, etc. Data flow analysis is employed to derive the time lower bound in the computation as well as the sequencing of the abstract architecture. The features of the target architecture are defined by optimization of the abstract architecture to exploit maximum parallelism while minimizing various overheads and architectural complexity. An algorithmically specialized, highly parallel, MIMD-SIMD architecture is designed and implemented that is capable of efficient exploitation of parallelism at several computational levels. The computation time of the Newton-Euler formulation for a 6-degree-of-freedom (dof) general manipulator is measured as 187 μs. The increase in computation time for each additional dof is 23 μs, which leads to a computation time of less than 500 μs, even for a 12-dof redundant arm.

14.
The implementation of fast numerical methods on parallel and vector computers is illustrated by describing the development of fast Fourier transform routines for the vector-processing Cray-1 and Cyber 205 machines. Various vectorization methods are presented for FFTs on the Cray-1. By performing a number of transforms in parallel, "super-vector" performance can be achieved. By modifying the algorithms slightly, multiple transforms can be implemented faster on the Cyber 205 (using 64-bit arithmetic on the 2-pipe model) than on the Cray-1, provided that enough transforms (of order 100) can be performed in parallel.

15.
By introducing a form of reordering for multidimensional data, we propose a unified fast algorithm that jointly employs the one-dimensional W transform and the multidimensional discrete polynomial transform to compute eleven types of multidimensional discrete orthogonal transforms, which comprise three types of m-dimensional discrete cosine transforms (m-D DCTs), four types of m-dimensional discrete W transforms (m-D DWTs) (with the m-dimensional Hartley transform as a special case), and four types of generalized discrete Fourier transforms (m-D GDFTs). For real input, the number of multiplications needed by the proposed algorithm for all eleven types of m-D discrete orthogonal transforms is only 1/m times that of the commonly used corresponding row-column methods, and for complex input it is further reduced to 1/(2m) times. The number of additions required is also reduced considerably. Furthermore, the proposed algorithm has a simple computational structure and is also easy to implement on a computer, and th

16.
An elementary and transparent representation of the fast Fourier transform is given. Instead of using the usual and highly algebraic approach, it is shown how a Fourier transform of order n = p·m can be reduced to p Fourier transforms of order m by performing essentially m Fourier transforms of order p on the data. The resulting process is discussed in more detail for n = 3^q and n = 5^q. The problem of retrieving the wanted coefficients from the final data is solved by a simple argument. The generalization to an order n equal to a product of powers of prime numbers is immediate.
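
The reduction of a transform of order n = p·m can be made concrete with a short sketch. It follows the usual decimation-in-frequency index split k = p·k1 + r, t = t1 + m·t2, which may differ in detail from the paper's presentation; the names `dft` and `dft_by_factoring` are hypothetical, and the naive `dft` stands in for whatever transform is applied recursively to the smaller orders.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dft_by_factoring(x, p):
    """Compute a DFT of order n = p*m from m transforms of order p
    followed by p transforms of order m."""
    x = list(x)
    n = len(x)
    if p <= 0 or n % p:
        raise ValueError("p must divide the transform order")
    m = n // p
    # Step 1: m transforms of order p on the strided data x[t1], x[t1+m], ...
    cols = [dft(x[t1::m]) for t1 in range(m)]
    out = [0j] * n
    for r in range(p):
        # Step 2: multiply by twiddle factors, then one transform of order m.
        row = [cols[t1][r] * cmath.exp(-2j * cmath.pi * t1 * r / n)
               for t1 in range(m)]
        part = dft(row)
        # Step 3: the k1-th output of this transform is coefficient p*k1 + r.
        for k1 in range(m):
            out[p * k1 + r] = part[k1]
    return out
```

Step 3 is the "retrieval" issue the abstract mentions: the coefficients come out interleaved by residue r, so they have to be written back to index p·k1 + r to appear in natural order.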

17.
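Research and Application of a Parallel Design Paradigm for Multi-core Image Processing   Total citations: 1, self-citations: 0, citations by others: 1
Wang Chengliang, Xie Kejia, Liu Xin 《计算机工程》 (Computer Engineering) 2011,37(14):220-222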
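In multi-core computing environments, parallel image processing algorithms can increase processing speed, but existing parallel designs target only specific algorithms such as edge detection and image projection, and no general design paradigm for parallel algorithms has emerged. To address this, based on a study of the mechanisms by which image processing algorithms can be parallelized and of the characteristics of multi-core architectures, a parallel design paradigm for image processing algorithms in multi-core environments is proposed, comprising five design steps: analysis, modeling, mapping, debugging and performance evaluation, and testing and release. Using the design of a parallel image Fourier transform algorithm as an example, the effectiveness of the paradigm is verified in single-core, dual-core, quad-core, and eight-core environments. Experimental results show that the paradigm can broaden the application scope of image processing in parallel design.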

18.
A computer program is described which performs power-spectral analyses on time-domain data. The program is written in the C language and incorporates an algorithm for the fast Fourier transform translated from BASIC into C. Sequential segments of time-domain data are accessed by the program, transformed to the frequency domain, and ensemble-averaged to generate smoothed spectra. Specific application of the program to the detection of high-frequency oscillations in the phrenic neurogram of the cat is addressed. Thus, 100 successive 512-point fast Fourier transforms were found to accurately reveal the relative strength (power) and frequency position (spectrum) of multiple peaks in this respiratory motor pathway. Because C language programs are very transportable, this program should run on machines other than our LSI 11/23, provided a C-compiler is available.
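
The segment-and-average procedure described in this abstract can be sketched as follows. The original program is written in C and is not reproduced here; this sketch assumes non-overlapping segments, no window function, and a power-of-two segment length, none of which the abstract specifies. The names `fft` and `averaged_power_spectrum` are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def averaged_power_spectrum(samples, segment_len):
    """Average |FFT|^2 over successive non-overlapping segments."""
    if segment_len <= 0 or segment_len & (segment_len - 1):
        raise ValueError("segment_len must be a power of two")
    n_segments = len(samples) // segment_len
    if n_segments == 0:
        raise ValueError("not enough samples for one segment")
    total = [0.0] * segment_len
    for s in range(n_segments):
        segment = samples[s * segment_len:(s + 1) * segment_len]
        for k, value in enumerate(fft(segment)):
            total[k] += abs(value) ** 2
    return [t / n_segments for t in total]
```

In the setting of the abstract, `segment_len` would be 512 and the input would hold 100 such segments; averaging reduces the variance of the estimate at each frequency bin without changing the frequency resolution.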

19.
Traffic modeling is a key step in several intelligent transportation systems (ITS) applications. This paper addresses traffic modeling by enhancing the cell transmission model. It treats the traffic flow as a hybrid dynamic system and proposes a piecewise switched linear traffic model. The model allows accurate modeling of the traffic flow in a given section by taking its geometry into account. In addition, the piecewise switched linear traffic model handles more than one congestion wave and has the advantage of being modular. The measurements at upstream and downstream boundaries are also used in this model in order to decouple the traffic flow dynamics of successive road portions. Finally, real magnetic sensor data, provided by the performance measurement system on a portion of the Californian SR60-E highway, are used to validate the proposed model.

20.
A program for a direct solution of the Poisson equation in cylindrically symmetric geometry is described. It is based on the use of fast Fourier transforms for the axial solution, and an expansion in cubic B-splines for the radial solution.
