期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

FPGA implementation of high-performance,resource-efficient Radix-16 CORDIC rotator based FFT algorithm

《Integration, the VLSI Journal》2020

The fast Fourier transform (FFT) is an algorithm widely used to compute the discrete Fourier transform (DFT) in real-time digital signal processing. High-performance with fewer resources is highly desirable for any real-time application. Our proposed work presents the implementation of the radix-2 decimation-in-frequency (R2DIF) FFT algorithm based on the modified feed-forward double-path delay commutator (DDC) architecture on FPGA device. Need for a complex multiplier to carry out the multiplication of complex twiddle factors and large memory to store the twiddle factors are the main concerns for FFT implementation. Propose work aims to address these issues. In this work, a high-performance radix-16 COordinate Rotational DIgital Computer (CORDIC) algorithm based rotator is proposed to carry out the complex twiddle factor multiplication. Further, CORDIC needs only rotational angles to carry out complex multiplication, which reduces the need for large memory to store the twiddle factors. To compute the total rotation for n-bit precision, our proposed radix-16 CORDIC algorithm takes n/4 iteration as compared to n iteration of the radix-2 CORDIC algorithm. Our proposed architecture of the radix-2 decimation-in-frequency (R2DIF) algorithm is implemented on a Virtex−7 series FPGA. Further, the detailed comparison is presented between our proposed FFT implementation and other recently proposed FFT implementations. Experimental results suggest that proposed implementation has less latency and hardware utilization as compared to recently proposed implementations. 相似文献

2.

WFTA算法的FPGA设计与实现

魏鹏孙磊王华力《通信技术》2011,44(4):167-169

Winograd傅里叶变换算法（WFTA）利用旋转因子W的特性对其进行分解,能够把FFT运算中乘法次数降到最低,是一种高效且资源占用相对较少的FFT实现方法。以256点分解为两维16×16点的小数组WFTA进行运算为例介绍了大数组WFTA算法的FPGA设计与实现方案。仿真测试表明,所设计的256点FFT处理器,乘法器资源消耗仅为基-2FFT的1/2、基-4FFT的2/3,且在100 MHz主时钟频率下完成运算仅需5.8μs,满足FFT处理器的高速实时性要求。相似文献

3.

Reduced Memory and Low Power Architectures for CORDIC-based FFT Processors

Erdal Oruklu Xin Xiao Jafar Saniie 《Journal of Signal Processing Systems》2012,66(2):129-134

This paper presents a pipelined, reduced memory and low power CORDIC-based architecture for fast Fourier transform implementation. The proposed algorithm utilizes a new addressing scheme and the associated angle generator logic in order to remove any ROM usage for storing twiddle factors. As a case study, the radix-2 and radix-4 FFT algorithms have been implemented on FPGA hardware. The synthesis results match the theoretical analysis and it can be observed that more than 20% reduction can be achieved in total memory logic. In addition, the dynamic power consumption can be reduced by as much as 15% by reducing memory accesses. 相似文献

4.

An efficient locally pipelined FFT processor

Liang Yang Kewei Zhang Hongxia Liu Jin Huang Shitan Huang 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2006,53(7):585-589

The fast Fourier transform (FFT) is a very important algorithm in digital signal processing. The locally pipelined (LPPL) architecture is an efficient structure for FFT processor designing in a real-time embedded system. Two basic building blocks, to the LPPL FFT processor, the butterfly in pipeline, and address generating, are discussed in this brief. Based on the "deep" feedback to butterfly-2, a novel approach for pipelined architecture, the radix-2 single-path deep delay feedback architecture is proposed. For length-N discrete Fourier transform computation, the dominant hardware requirements are minimal for complex multipliers log/sub 4/N-1 and adders 2log/sub 4/N. As an integral need of the LPPL FFT processor design, address generating and coefficient store-load structures are also presented. 相似文献

5.

A new radix-2/8 FFT algorithm for length-q/spl times/2/sup m/ DFTs

Bouguezel S. Ahmad M.O. Swamy M.N.S. 《IEEE transactions on circuits and systems. I, Regular papers》2004,51(9):1723-1732

In this paper, a new radix-2/8 fast Fourier transform (FFT) algorithm is proposed for computing the discrete Fourier transform of an arbitrary length N=q/spl times/2/sup m/, where q is an odd integer. It reduces substantially the operations such as data transfer, address generation, and twiddle factor evaluation or access to the lookup table, which contribute significantly to the execution time of FFT algorithms. It is shown that the arithmetic complexity (multiplications+additions) of the proposed algorithm is, in most cases, the same as that of the existing split-radix FFT algorithm. The basic idea behind the proposed algorithm is the use of a mixture of radix-2 and radix-8 index maps. The algorithm is expressed in a simple matrix form, thereby facilitating an easy implementation of the algorithm, and allowing for an extension to the multidimensional case. For the structural complexity, the important properties of the Cooley-Tukey approach such as the use of the butterfly scheme and in-place computation are preserved by the proposed algorithm. 相似文献

6.

二维级联流水结构大点数FFT运算器实现研究

王晓君龙腾周希元《无线电工程》2010,40(11):19-22

大点数快速傅里叶变换(FFT)运算在雷达、通信信号侦察中有广泛应用,其基于现场可编程门阵列(FPGA)的实现方法有重要的研究价值。推导出点数为N的大点数FFT运算分解为2级小点数FFT运算级联的运算公式,在此基础上给出其实现步骤,从流水线结构设计、基本运算单元以及地址生成等方面详细介绍一维列(行)变换的工程实现方法,并给出列、行变换之间所乘旋转因子的压缩算法。工程实际应用表明,该大点数FFT运算器具有变换速度快、调试方便及可在单片FPGA实现的优点。相似文献

7.

A reconfigurable 4-GS/s power-efficient floating-point FFT processor design and implementation based on single-sided binary-tree decomposition

《Integration, the VLSI Journal》2019

This paper presents a high throughput size-configurable floating point (FP) Fast Fourier Transform (FFT) processor, having implemented the 8-parallel multi-path delay feedback (MDF) functions suitable for applications in the real-time radar imaging system. With regard to floating-point FFT design, to acquire a high throughput with restricted area and power consumptions poses as a greater challenge due to some higher degrees of complexity involved in realizing of FP operations than those fixed-point counterparts. To address the related issues, a novel mixed-radix FFT algorithm featuring the single-sided binary-tree decomposition strategy is proposed aiming at effectively containing the complexity of multiplications for any 2^k-point FFT. To this aid, the parallel-processing twiddle factor generator and the dual addition-and-rounding fused FP arithmetic units are optimized to meet the high accuracy demand in computation and the low power budget in implementation. The proposed FP FFT processor has been designed in silicon based on SMIC's 28 nm CMOS technology with the active area of 1.39 mm². The prototype design delivers a throughput of 4 GSample/s at 500 MHz, at a peak power consumption of 84.2 mW. Thus, the proposed design approach achieves a significant improvement in power efficiency approximately by 14 times on average over some other FP FFT processors previously reported. 相似文献

8.

High‐Performance Low‐Power FFT Cores

Wei Han Ahmet T. Erdogan Tughrul Arslan Mohd. Hasan 《ETRI Journal》2008,30(3):451-460

Recently, the power consumption of integrated circuits has been attracting increasing attention. Many techniques have been studied to improve the power efficiency of digital signal processing units such as fast Fourier transform (FFT) processors, which are popularly employed in both traditional research fields, such as satellite communications, and thriving consumer electronics, such as wireless communications. This paper presents solutions based on parallel architectures for high throughput and power efficient FFT cores. Different combinations of hybrid low‐power techniques are exploited to reduce power consumption, such as multiplierless units which replace the complex multipliers in FFTs, low‐power commutators based on an advanced interconnection, and parallel‐pipelined architectures. A number of FFT cores are implemented and evaluated for their power/area performance. The results show that up to 38% and 55% power savings can be achieved by the proposed pipelined FFTs and parallel‐pipelined FFTs respectively, compared to the conventional pipelined FFT processor architectures. 相似文献

9.

WIMAX系统中可配置FFT/IFFT的设计与实现

刘德福雷天民马卓《电子科技》2010,23(3):17-19

针对WIMAX系统中变长子载波的特点,通过采用流水线乒乓结构,以基2、基4混合基实现了高速可配置的FFT/IFFT。将不同点数的FFT旋转因子统一存储,同时对RAM单元进行优化,节约了存储空间;此外对基4蝶形单元进行优化,减少了加法和乘法运算单元。仿真和综合结果表明,设计满足了WIMAX高速系统中不同带宽下FFT/IFFT的要求。相似文献

10.

An area-efficient and low-power 64-point pipeline Fast Fourier Transform for OFDM applications

《Integration, the VLSI Journal》2017

In an orthogonal frequency division multiplexing (OFDM) based wireless systems, Fast Fourier Transform (FFT) is a critical block as it occupies large area and consumes more power. In this paper, we present an area-efficient and low power 16-bit word-width 64-point radix-2² and radix-2³ pipelined FFT architectures for an OFDM-based IEEE 802.11a wireless LAN baseband. The designs are derived from radix-2^k algorithm and adopt a Single-Path Delay Feedback (SDF) architecture for hardware implementation. To eliminate the complex multipliers and read-only memory (ROM) which is used for internal storage of twiddle factor coefficients, the proposed 64-point FFT employs a Canonical Signed Digit (CSD) complex constant multiplier using adders, multiplexers and shifters. The complex constant multiplier (CCM) is modified using common sub-expression sharing block that reduces the area of the design. The proposed radix-2² and radix-2³ pipelined FFT architectures are modeled and implemented using TSMC 180 nm CMOS technology with a supply voltage of 1.8 V. The implementation results show that the proposed architectures significantly reduces the hardware cost and power consumption in comparison to existing 64-point FFT architectures. 相似文献

11.

时间抽取情况下的旋转因子合并FFT算法和TMFFT的软件实现

许蔚陈宗骘《电子与信息学报》1988,10(2):97-105

马滕斯(Martens)提出了一种效率高(可与WFTA法和PFA法相比拟)、结构简单(与FFT法相似)的DFT计算方法RGFA。作者已经证明,在基2的情况下,RCFA与旋转因子合并的频率抽取FFT算法是完全等价的。本文给出了旋转因子合并的时间抽取FFT算法,从而使得在任何条件下,目前使用的FFT算法都可以用外部特性完全相同、内部结构基本相同的高效算法旋转因子合并FFT算法来代替。本文还给出了实现旋转因子合并FFT算法的软件。相似文献

12.

TWIDDLE FACTOR MERGED TIME-DECIMAL FFT ALGORITHM AND THE SOFTWARE IMPLEMENTATION FOR TWIDDLE FACTOR MERGED FFT ALGORITHM

许蔚陈宗骘《电子科学学刊(英文版)》1989,6(1):1-9

Martens proposed a highly efficient and simply formed DFT algorithm——RCFA,whose efficien-cy is comparable with that of WFTA or that of PFA,and whose structure is similar to that of FFT.Theauthors have proved that,in the case of radix 2,the RCFA is exactly equivalent to the twiddle factor mergedfrequency-decimal FFT algorithm.The twiddle factor merged time-decimal FFT algorithm is providedin this paper.Thus,in any case,the FFT algorithm used currently can be replaced by the more efficientalgorithm——the twiddle factor merged FFT algorithm,with exactly the same external property and thesimilar internal structure.Also in this paper,the software for implementing the twiddle factor merged FFTalgorithm(TMFFT)is provided. 相似文献

13.

Hardware architecture for an anti-traffic noise system

《Microelectronics Journal》2015,46(5):370-376

This work presents an energy efficient architecture for an anti-traffic noise system. The hardware is designed for a road side unit (RSU) in intelligent transportation systems. Fast Fourier Transform is the cornerstone for the suggested system. An ultra low power architecture for the FFT suitable for FPGA implementation is derived. Bit-widths for both data and twiddle factors are optimized for low-power. The architecture uses an efficient complex multiplier that has 25% less multiplications. An algorithm to compute the number of time-shared butterflies for a given FFT block size and a target throughput is elaborated. Finally synthesis results using fixed-point VHDL library and commercial IP are presented and compared with the proposed FFT processor. 相似文献

14.

New continuous-flow mixed-radix (CFMR) FFT Processor using novel in-place strategy 总被引：1，自引：0，他引：1

Jo B.G. Sunwoo M.H. 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(5):911-919

The paper proposes a new continuous-flow mixed-radix (CFMR) fast Fourier transform (FFT) processor that uses the MR (radix-4/2) algorithm and a novel in-place strategy. The existing in-place strategy supports only a fixed-radix FFT algorithm. In contrast, the proposed in-place strategy can support the MR algorithm, which allows CF FFT computations regardless of the length of FFT. The novel in-place strategy is made by interchanging storage locations of butterfly outputs. The CFMR FFT processor provides the MR algorithm, the in-place strategy, and the CF FFT computations at the same time. The CFMR FFT processor requires only two N-word memories due to the proposed in-place strategy. In addition, it uses one butterfly unit that can perform either one radix-4 butterfly or two radix-2 butterflies. The CFMR FFT processor using the 0.18 /spl mu/m SEC cell library consists of 37,000 gates excluding memories, requires only 640 clock cycles for a 512-point FFT and runs at 100 MHz. Therefore, the CFMR FFT processor can reduce hardware complexity and computation cycles compared with existing FFT processors. 相似文献

15.

Conflict-Free Parallel Memory Accessing Techniques for FFT Architectures

《IEEE transactions on circuits and systems. I, Regular papers》2008,55(11):3438-3447

Speeding up fast Fourier transform (FFT) computations is critical for today's real-time systems targeting signal processing and telecommunication applications. Aiming at the performance improvement and the efficiency of FFT architectures, this paper presents an address generation technique which enables a radix-$b$ processor to access in parallel $b$ memory banks without conflicts during each stage's computations. Using $kb$ memory banks at each stage leads to increasing the speedup of the algorithm by a factor of $kb$ . The address generation can be realized in each radix-$b$ stage by the use of lookup tables of size $O(kb^{2})$ bits. The proposed technique is cost efficient and leads to the design of FFT architectures of high speedup and high sustained throughput. 相似文献

16.

Design and implementation of a fast algorithm for modulated lappedtransform

Jing C.Y. Tai H.-M. 《Vision, Image and Signal Processing, IEE Proceedings -》2002,149(1):27-32

A new fast algorithm for the computation of the modulated lapped transform (MLT) is proposed and its efficient implementation using pipelining techniques and complex programmable logic device (CPLD) is presented. The new algorithm computes a length-M MLT via the length-M/2 fast Fourier transform (FFT). Computational overhead due to data shuffling in pre-processing and post-processing is offset in hardware realisation. Hence the overall throughput of the MLT computation for real-time applications is significantly improved. The pipelined CPLD architecture and circuitry are described in detail. Computational complexity of the proposed algorithm is analysed, and throughput improvement is verified by experimental results 相似文献

17.

Static quantised radix-2 fast Fourier transform (FFT)/inverse FFT processor for constraints analysis

Rozita Teymourzadeh Memtode Jim Abigo Mok Vee Hoong 《International Journal of Electronics》2013,100(2):231-240

This research work focuses on the design of a high-resolution fast Fourier transform (FFT)/inverse FFT (IFFT) processors for constraints analysis purpose. Amongst the major setbacks associated with such high-resolution FFT processors are the high-power consumption resulting from the structural complexity and computational inefficiency of floating-point calculations. As such, a parallel pipelined architecture was proposed to statically scale the resolution of the processor to suite adequate trade-off constraints. The quantisation was applied to provide an approximation to address the finite word-length constraints of digital signal processing. An optimum operating mode was proposed, based on the signal-to-quantisation-noise ratio (SQNR) as well as the statistical theory of quantisation, to minimise the trade-off issues associated with selecting the most application-efficient floating-point processing capability in contrast to their resolution quality. 相似文献

18.

New radix-(2/spl times/2/spl times/2)/(4/spl times/4/spl times/4) and radix-(2/spl times/2/spl times/2)/(8/spl times/8/spl times/8) DIF FFT algorithms for 3-D DFT

Bouguezel S. Ahmad M.O. Swamy M.N.S. 《IEEE transactions on circuits and systems. I, Regular papers》2006,53(2):306-315

In this paper, new three-dimensional (3-D) radix-(2/spl times/2/spl times/2)/(4/spl times/4/spl times/4) and radix-(2/spl times/2/spl times/2)/(8/spl times/8/spl times/8) decimation-in-frequency (DIF) fast Fourier transform (FFT) algorithms are developed and their implementation schemes discussed. The algorithms are developed by introducing the radix-2/4 and radix-2/8 approaches in the computation of the 3-D DFT using the Kronecker product and appropriate index mappings. The butterflies of the proposed algorithms are characterized by simple closed-form expressions facilitating easy software or hardware implementations of the algorithms. Comparisons between the proposed algorithms and the existing 3-D radix-(2/spl times/2/spl times/2) FFT algorithm are carried out showing that significant savings in terms of the number of arithmetic operations, data transfers, and twiddle factor evaluations or accesses to the lookup table can be achieved using the radix-(2/spl times/2/spl times/2)/(4/spl times/4/spl times/4) DIF FFT algorithm over the radix-(2/spl times/2/spl times/2) FFT algorithm. It is also established that further savings can be achieved by using the radix-(2/spl times/2/spl times/2)/(8/spl times/8/spl times/8) DIF FFT algorithm. 相似文献

19.

A Low‐Complexity 128‐Point Mixed‐Radix FFT Processor for MB‐OFDM UWB Systems

Sang‐In Cho Kyu‐Min Kang 《ETRI Journal》2010,32(1):1-10

In this paper, we present a fast Fourier transform (FFT) processor with four parallel data paths for multiband orthogonal frequency‐division multiplexing ultra‐wideband systems. The proposed 128‐point FFT processor employs both a modified radix‐2⁴ algorithm and a radix‐2³ algorithm to significantly reduce the numbers of complex constant multipliers and complex booth multipliers. It also employs substructure‐sharing multiplication units instead of constant multipliers to efficiently conduct multiplication operations with only addition and shift operations. The proposed FFT processor is implemented and tested using 0.18 µm CMOS technology with a supply voltage of 1.8 V. The hardware‐ efficient 128‐point FFT processor with four data streams can support a data processing rate of up to 1 Gsample/s while consuming 112 mW. The implementation results show that the proposed 128‐point mixed‐radix FFT architecture significantly reduces the hardware cost and power consumption in comparison to existing 128‐point FFT architectures. 相似文献

20.

Applications of momentary Fourier transform in frequency shift keying signal demodulation

下载免费PDF全文

Zhonghui Chen Yiwen Xu Weixing Wang Xinxin Feng 《International Journal of Communication Systems》2014,27(10):2022-2029

Power line communication is attracting increasing attention for its characteristics of easy availability and low cost. Frequency shift keying with the virtue of antijamming capability is suitable for the low signal‐to‐noise ratio power line environment. The principle of the frequency shift keying communication system based on momentary Fourier transform (MFT) algorithm is provided first in this paper. Theoretically, the twiddle factor in the MFT algorithm is a decimal with infinite length. However, when it is applied on digital signal processor (DSP), the registers with finite length in DSP lead to the inaccuracy of the twiddle factor. Thus, the MFT algorithm with recursive form will cause accumulative error as a result. We analyze the impact of the inaccurate twiddle factor on the MFT algorithm from a novel point of view, by which the MFT algorithm is considered as a linear time invariant system. We prove that the accumulative error would not disperse when the modulus of the twiddle factor is smaller than 1. Simulation results show that the truncation approach is more effective than both carrying and rounding methods in a practical system. In addition, the number of significant digits of truncated twiddle factor must be chosen larger than 4 to overcome the problem of accumulative error.Copyright © 2012 John Wiley & Sons, Ltd. 相似文献