期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Multi-mode parallel and folded VLSI architectures for 1D-fast Fourier transform

《Integration, the VLSI Journal》2016

The modern real time applications like orthogonal frequency division multiplexing and etc., demand high performance fast Fourier transform (FFT) design with less area and clock cycles. This paper proposes efficient FFT VLSI architectures using folded/parallel implementation. In the proposed folded FFT architecture, the number of cycles required to complete the operation is less than single path delay feedback (SDF)/multi-path delay commutator (MDC) architectures. In the proposed parallel FFT architecture, N-point FFT is implemented by using one N/2-point FFT without much extra hardware. Both the proposed architectures are implemented for radix-2, 2², and 4 using 45 nm technology library. The proposed parallel architecture achieves 56.7% and 40.6% of area reduction as compared with the existing parallel architecture based 16-point radix-2 and radix-2² DIF FFTs respectively. The proposed folded architecture achieves 65.5%, 51.1%, and 35.8% of worst path delay reduction as compared with the existing SDF based 16-point radix-2, radix-2², and radix-4 DIF FFTs respectively. 相似文献

2.

Dynamic Partial Reconfigurable FFT for OFDM Based Communication Systems

C. Vennila G. Lakshminarayanan Seok-Bum Ko 《Circuits, Systems, and Signal Processing》2012,31(3):1049-1066

This paper presents a novel scalable and runtime dynamically reconfigurable FFT architecture for different wireless standards. With only 8 butterfly units, a reconfigurable FFT architecture for three different FFT points is realized using mixed radix-2²/2³/2⁴ FFT algorithm in a modified Single-path Delay Feedback (SDF) pipelined architecture. Via a proper data flow reconfiguration it can support 64, 128 and 256. It can even be extended up to 8192-point transforms and uses only 13 butterfly units to realize 8192 points. This paper describes the implementation method of 256 and 128 point FFT, which is reconfigured partially from 64 point FFT. The whole system is implemented on a Xilinx XC2VP30 FPGA device. The implementation design addresses area efficiency and flexibility allowing the insertion of the partial modules dynamically to realize various FFT sizes. To verify the efficacy of this dynamic partial reconfigurable FFT design method, a conventional multiplexer based reconfigurable architecture was designed and tested on the same platform. Tested FPGA results for the Dynamic Partial Reconfigurable (DPR) method show the configuration time improvement and good area efficiency as compared to the reconfigurable architecture using conventional multiplexer techniques. 相似文献

3.

An Area Efficient FFT/IFFT Processor for MIMO-OFDM WLAN 802.11n

Bo Fu Paul Ampadu 《Journal of Signal Processing Systems》2009,56(1):59-68

A pipelined Fast Fourier Transform and its inverse (FFT/IFFT) processor, which utilizes hardware resources efficiently, is proposed for MIMO-OFDM WLAN 802.11n. Compared with a conventional MIMO-OFDM implementation, (in which as many FFT/IFFT processors as the number of transmit/receive antennas is used), the proposed architecture (using hardware sharing among multiple data sequences) reduces hardware complexity without sacrificing system throughput. Further, the proposed architecture can support 1–4 input data sequences with sequence lengths of 64 or 128, as needed. The FFT/IFFT processor is synthesized using TSMC 0.18 um CMOS technology and saves 25% area compared to a conventional implementation approach using radix-2³ algorithm. The proposed FFT/IFFT processor can be configured to improve power efficiency according to the number of input data sequences and the sequence length. The processor consumes 38 mW at 75 MHz for one input sequence with 64-point length; it consumes 87 mW at 75 MHz for four input sequences with length 128-point and can be efficiently used for IEEE 802.11n WLAN standard.

Paul AmpaduEmail:

相似文献

4.

An area-efficient and low-power 64-point pipeline Fast Fourier Transform for OFDM applications

《Integration, the VLSI Journal》2017

In an orthogonal frequency division multiplexing (OFDM) based wireless systems, Fast Fourier Transform (FFT) is a critical block as it occupies large area and consumes more power. In this paper, we present an area-efficient and low power 16-bit word-width 64-point radix-2² and radix-2³ pipelined FFT architectures for an OFDM-based IEEE 802.11a wireless LAN baseband. The designs are derived from radix-2^k algorithm and adopt a Single-Path Delay Feedback (SDF) architecture for hardware implementation. To eliminate the complex multipliers and read-only memory (ROM) which is used for internal storage of twiddle factor coefficients, the proposed 64-point FFT employs a Canonical Signed Digit (CSD) complex constant multiplier using adders, multiplexers and shifters. The complex constant multiplier (CCM) is modified using common sub-expression sharing block that reduces the area of the design. The proposed radix-2² and radix-2³ pipelined FFT architectures are modeled and implemented using TSMC 180 nm CMOS technology with a supply voltage of 1.8 V. The implementation results show that the proposed architectures significantly reduces the hardware cost and power consumption in comparison to existing 64-point FFT architectures. 相似文献

5.

A multi‐mode IFFT/FFT processor for IEEE 802.11ac: design and implementation

Abdelmohsen Ali Walaa Hamouda 《Wireless Communications and Mobile Computing》2016,16(13):1713-1725

In this paper, we present 64/128/256/512‐point inverse fast Fourier transform (IFFT)/FFT processor for single‐user and multi‐user multiple‐input multiple‐output orthogonal frequency‐division multiplexing based IEEE 802.11ac wireless local area network transceiver. The multi‐mode processor is developed by an eight‐parallel mixed‐radix architecture to efficiently produce full reconfigurability for all multi‐user combinations. The proposed design not only supports the operation of IFFT/FFT for 1–8 different data streams operated by different users in case of downlink transmission, but also, it provides different throughput rates to meet IEEE 802.11ac requirements at the minimum possible clock frequency. Moreover, less power is needed in our design compared with traditional software approach. The design is carefully optimized to operate by the minimum wordlengths that fulfill the performance and complexity specifications. The processor is designed and implemented on Xilinx Vertix‐5 field programmable gate array technology. Although the maximum clock frequency is 377.84 MHz, the processor is clocked by the operating sampling rate to further reduce the power consumption. At the operation clock rate of 160 MHz, our proposed processor can calculate 512‐point FFT with up to eight independent data sequences within 3.2~μs meeting IEEE 802.11ac standard requirements. Copyright © 2015 John Wiley & Sons, Ltd. 相似文献

6.

A dynamic scaling FFT processor for DVB-T applications 总被引：1，自引：0，他引：1

Yu-Wei Lin Hsuan-Yu Liu Chen-Yi Lee 《Solid-State Circuits, IEEE Journal of》2004,39(11):2005-2013

This paper presents an 8192-point FFT processor for DVB-T systems, in which a three-step radix-8 FFT algorithm, a new dynamic scaling approach, and a novel matrix prefetch buffer are exploited. About 64 K bit memory space can be saved in the 8 K point FFT by the proposed dynamic scaling approach. Moreover, with data scheduling and pre-fetched buffering, single-port memory can be adopted without degrading throughput rate. A test chip for 8 K mode DVB-T system has been designed and fabricated using 0.18-/spl mu/m single-poly six-metal CMOS process with core area of 4.84 mm/sup 2/. Power dissipation is about 25.2 mW at 20 MHz. 相似文献

7.

基于Radix-4实现高速Viterbi译码器设计

马力陈泳恩《通信技术》2002,(3):13-14

使用一种新的Viterbi译码器设计方法来达到高速率、低功耗设计。在传统Viterbi译码器中,ACS(add-compare-select)单元是基于radix-2网格设计的,而这里将介绍一种新的ACS设计方法,即基于radix-4网格的ACS单元设计。每个这样的ACS单元将有4路输入,即在每个时钟周期能够处理两级传统的基于radix-2设计的两级网格。同时在这里的Viterbi译码器设计中采用了Top-To-Down设计思想,用Verilog语言来描述RTL电路层。并用QuartusII软件进行电路仿真和综合。用本算法在33.333MHz时钟下实观在Altera公司的APEX20KFPGA的64状态Viterbi译码器译码速率可达8Mbps以上,且仅占用很小的硬件资源。采用此方法设计的高速Viterbi解码器SoftIPCore可应用于需要高速,低功耗译码的多媒体移动通讯上。相似文献

8.

An Area-Efficient Design of Variable-Length Fast Fourier Transform Processor

Shuenn-Shyang Wang Chien-Sung Li 《Journal of Signal Processing Systems》2008,51(3):245-256

Fast Fourier transform (FFT) plays an important role in the orthogonal frequency division multiplexing (OFDM) communication systems. In this paper, we propose an area-efficient design of variable-length FFT processor which can perform various FFT lengths of 512/1,024/2,048/4,096/8,192 points used in OFDM-based communication systems, such as digital audio broadcasting (DAB), digital video broadcasting-terrestrial (DVB-T) and digital video broadcasting-handheld (DVB-H). To reduce computational complexity and chip area, we develop a new variable-length FFT architecture by devising a mixed-radix algorithm that consist of radix-2, radix-2² and radix-2/4/8 algorithms and optimizing the realization by substructure sharing. Based on this architecture, an area-efficient design of variable-length FFT processor is presented. By synthesized using the UMC 0.18 μm process, the area of the processor is 2.9 mm² and the 8,192-point FFT can be performed correctly up to 50 MHz with power consumption 823 mW under a 1.8 V supply voltage.

Shuenn-Shyang WangEmail:

相似文献

9.

An expandable column fft architecture using circuit switching networks

Tom Chen Li Zhu 《The Journal of VLSI Signal Processing》1993,6(3):243-257

The Fast Fourier Transform (FFT) is widely used in various digital signal processing applications. The performance requirements for FFT in modern real-time applications has increased dramatically due to the high demand on capacity and performance of modern telecommunication systems, where FFT plays a major role. Software implementations of FFT running on a general purpose computer can no longer meet current speed requirements. However, recent advances in VLSI technology have made it possible to implement the entire FFT system on a single silicon substrate. This article presents a column FFT design suitable for ULSI (Ultra Large Scale Integration) implementations. The basic building block is a 64-point column FFT. FFTs with longer transform lengths can be easily realized using the 64-point column FFT building block. The butterfly processors in the column FFT are connected using circuit switching networks. The circuit switching networks not only provide dynamically recon-figurable interconnections among the butterfly processors, but also provide a fault-tolerant capability. Bit-serial arithmetic is used in the architecture. Assuming the data word length is 16 bits, the 1024-point column FFT engine proposed in this article is capable of processing 1024 complex data samples in 533 clock cycles. If the clock frequency is 40 MHz, it will take 13.3 µs to complete a 1024-point FFT. 相似文献

10.

Cost-Effective Triple-Mode Reconfigurable Pipeline FFT/IFFT/2-D DCT Processor

Chin-Teng Lin Yuan-Chu Yu Lan-Da Van 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(8):1058-1071

This investigation proposes a novel radix-4² algorithm with the low computational complexity of a radix-16 algorithm but the lower hardware requirement of a radix-4 algorithm. The proposed pipeline radix-4² single delay feedback path (R4²SDF) architecture adopts a multiplierless radix-4 butterfly structure, based on the specific linear mapping of common factor algorithm (CFA), to support both 256-point fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) and 8times8 2D discrete cosine transform (DCT) modes following with the high efficient feedback shift registers architecture. The segment shift register (SSR) and overturn shift register (OSR) structure are adopted to minimize the register cost for the input re-ordering and post computation operations in the 8times8 2D DCT mode, respectively. Moreover, the retrenched constant multiplier and eight-folded complex multiplier structures are adopted to decrease the multiplier cost and the coefficient ROM size with the complex conjugate symmetry rule and subexpression elimination technology. To further decrease the chip cost, a finite wordlength analysis is provided to indicate that the proposed architecture only requires a 13-bit internal wordlength to achieve 40-dB signal-to-noise ratio (SNR) performance in 256-point FFT/IFFT modes and high digital video (DV) compression quality in 8 times 8 2D DCT mode. The comprehensive comparison results indicate that the proposed cost effective reconfigurable design has the smallest hardware requirement and largest hardware utilization among the tested architectures for the FFT/IFFT computation, and thus has the highest cost efficiency. The derivation and chip implementation results show that the proposed pipeline 256-point FFT/IFFT/2D DCT triple-mode chip consumes 22.37 mW at 100 MHz at 1.2-V supply voltage in TSMC 0.13-mum CMOS process, which is very appropriate for the RSoCs IP of next-generation handheld devices. 相似文献

11.

一种基于FPGA的高性能FFT处理器设计 总被引：1，自引：0，他引：1

张傲华张正鸿尧德中《电子信息对抗技术》2005,20(4):44-47

FFT算法是高速实时信号处理的关键算法之一,在数字EW接收机中有着广泛的应用前景。本文基于Xilinx公司的Vertex-IIPro系列FPGA,设计一种级联结构的1024点FFT处理器,采用基-4并行蝶算单元,能并行处理四路输入数据,极大地提高了FFT的处理速度。在系统时钟为100MHz时,完成1024点复数FFT运算仅需要2.56μs。相似文献

12.

A 2.4-Gsample/s DVFS FFT Processor for MIMO OFDM Communication Systems 总被引：1，自引：0，他引：1

Yuan Chen Yu-Wei Lin Yu-Chi Tsao Chen-Yi Lee 《Solid-State Circuits, IEEE Journal of》2008,43(5):1260-1273

This paper presents a new dynamic voltage and frequency scaling (DVFS) FFT processor for MIMO OFDM applications. By the proposed multimode multipath-delay-feedback (MMDF) architecture, our FFT processor can process 1-8-stream 256-point FFTs or a high-speed 256-point FFT in two processing domains at minimum clock frequency for DVFS operations. A parallelized radix-2⁴ FFT algorithm is also employed to save the power consumption and hardware cost of complex multipliers. Furthermore, a novel open-loop voltage detection and scaling (OLVDS) mechanism is proposed for fast and robust voltage management. With these schemes, the proposed FFT processor can operate at adequate voltage/frequency under different configurations to support the power-aware feature. A test chip of the proposed FFT processor has been fabricated using UMC 90 nm single-poly nine-metal CMOS process with a core area of 1.88 times1.88 mm² . The SQNR performance of this FFT chip is over 35.8 dB for QPSK/16-QAM modulation. Power dissipation of 2.4 Gsample/s 256-point FFT computations is about 119.7 mW at 0.85 V. Depending on the operation mode, power can be saved by 18%-43% with voltage scaling in TT corner. 相似文献

13.

Design of an FFT/IFFT Processor for MIMO OFDM Systems

Lin Y.-W. Lee C.-Y. 《IEEE transactions on circuits and systems. I, Regular papers》2007,54(4):807-815

In this paper, we present a novel 128/64 point fast Fourier transform (FFT)/ inverse FFT (IFFT) processor for the applications in a multiple-input multiple-output orthogonal frequency-division multiplexing based IEEE 802.11n wireless local area network baseband processor. The unfolding mixed-radix multipath delay feedback FFT architecture is proposed to efficiently deal with multiple data sequences. The proposed processor not only supports the operation of FFT/IFFT in 128 points and 64 points but can also provide different throughput rates for 1-4 simultaneous data sequences to meet IEEE 802.11n requirements. Furthermore, less hardware complexity is needed in our design compared with traditional four-parallel approach. The proposed FFT/IFFT processor is designed in a 0.13-mum single-poly and eight-metal CMOS process. The core area is 660times2142 mum², including an FFT/IFFT processor and a test module. At the operation clock rate of 40 MHz, our proposed processor can calculate 128-point FFT with four independent data sequences within 3.2 mus meeting IEEE 802.11n standard requirements 相似文献

14.

基于FPGA的移位寄存器流水线结构FFT处理器设计与实现

郝小龙韦高刘娜《现代电子技术》2010,33(9):172-176

设计实现了基于FPGA的256点定点FFT处理器。处理器以基-2算法为基础,通过采用高效的两路输入移位寄存器流水线结构,有效提高了碟形运算单元的运算效率,减少了寄存器资源的使用,提高了最大工作频率,增大了数据吞吐量,并且使得处理器具有良好的可扩展性。详细描述了具体设计的算法结构和各个模块的实现。设计采用Verilog HDL作为硬件描述语言,采用QuartusⅡ设计仿真工具进行设计、综合和仿真,仿真结果表明,处理器工作频率为72 MHz,是一种高效的FFT处理器IP核。相似文献

15.

一种8点FFT算法的逻辑电路实现 总被引：3，自引：0，他引：3

孙飞周宁孙亚楠郑南宁《微电子学与计算机》2002,19(11):5-17,10

文章讲述了一种8点FFT的verilog HDL设计实现，着重阐述了有符号的复数加／减法器的设计，并给出了整个系统仿真的结果，该设计可以在4个时钟内计算一个数据，最后该系统在Altera公司在FLEX10K的FPGA上成功地实现了综合，实验结果证明这种设计方法不但具有设计方法简单的优点，而且吞吐量可以达到2M／s的变换速度，已经完全能够满足目前一些系统的要求（如OFDM系统）。相似文献

16.

高吞吐率双模浮点可重构FFT处理器设计实现

魏星黄志洪杨海钢《电子与信息学报》2018,40(12):3042-3050

高吞吐浮点可灵活重构的快速傅里叶变换(FFT)处理器可满足尖端雷达实时成像和高精度科学计算等多种应用需求。与定点FFT相比,浮点运算复杂度更高,使得浮点型FFT的运算吞吐率与其实现面积、功耗之间的矛盾问题尤为突出。鉴于此,为降低运算复杂度,首先将大点数FFT分解成若干个小点数基2k 级联子级实现,提出分别针对128/256/512/1024/2048点FFT的优化混合基算法。同时,结合所提出同时支持单通道单精度和双通道半精度两种浮点模式的新型融合加减与点乘运算单元,首次提出一款高吞吐率双模浮点可变点FFT处理器结构,并在28 nm标准CMOS工艺下进行设计并实现。实验结果表明,单通道单精度和双通道半精度浮点两种模式下的运算吞吐率和输出平均信号量化噪声比分别为3.478 GSample/s, 135 dB和6.957 GSample/s, 60 dB。归一化吞吐率面积比相比于现有其他浮点FFT实现可提高约12倍。相似文献

17.

WFTA算法的FPGA设计与实现

魏鹏孙磊王华力《通信技术》2011,44(4):167-169

Winograd傅里叶变换算法（WFTA）利用旋转因子W的特性对其进行分解,能够把FFT运算中乘法次数降到最低,是一种高效且资源占用相对较少的FFT实现方法。以256点分解为两维16×16点的小数组WFTA进行运算为例介绍了大数组WFTA算法的FPGA设计与实现方案。仿真测试表明,所设计的256点FFT处理器,乘法器资源消耗仅为基-2FFT的1/2、基-4FFT的2/3,且在100 MHz主时钟频率下完成运算仅需5.8μs,满足FFT处理器的高速实时性要求。相似文献

18.

基于改进FFT算法的OFDM调制/解调模块设计 总被引：4，自引：4，他引：0

廖巨华陈岚《微电子学与计算机》2005,22(6):57-60

文章对传统FFT算法进行了改进,改进后的算法将N点DFT分解成二维√N点DFT的组合,在结构上更适合于用流水线方式实现FFT.文章首先对算法进行了推导,然后基于该算法设计了一个64点、32位字长的定点IFFT/FFT模块,用于802.11a中OFDM的调制/解调.与传统的流水线FFT比较,该模块中的复数乘法运算全部采用移位相加操作完成,因而消除了乘法器及旋转因子ROM的使用,降低了功耗.最后,对该模块进行了验证仿真.结果表明,在流水线饱和的情况下,该模块完成一个64点的FFT运算只需要8个时钟周期,在20MHZ时钟频率下,该模块的功耗为0.26W,完全能满足移动通信中对于高速度、低功耗的要求. 相似文献

19.

New continuous-flow mixed-radix (CFMR) FFT Processor using novel in-place strategy 总被引：1，自引：0，他引：1

Jo B.G. Sunwoo M.H. 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(5):911-919

The paper proposes a new continuous-flow mixed-radix (CFMR) fast Fourier transform (FFT) processor that uses the MR (radix-4/2) algorithm and a novel in-place strategy. The existing in-place strategy supports only a fixed-radix FFT algorithm. In contrast, the proposed in-place strategy can support the MR algorithm, which allows CF FFT computations regardless of the length of FFT. The novel in-place strategy is made by interchanging storage locations of butterfly outputs. The CFMR FFT processor provides the MR algorithm, the in-place strategy, and the CF FFT computations at the same time. The CFMR FFT processor requires only two N-word memories due to the proposed in-place strategy. In addition, it uses one butterfly unit that can perform either one radix-4 butterfly or two radix-2 butterflies. The CFMR FFT processor using the 0.18 /spl mu/m SEC cell library consists of 37,000 gates excluding memories, requires only 640 clock cycles for a 512-point FFT and runs at 100 MHz. Therefore, the CFMR FFT processor can reduce hardware complexity and computation cycles compared with existing FFT processors. 相似文献

20.

A Gbps IPSec SSL Security Processor Design and Implementation in an FPGA Prototyping Platform

Haixin Wang Guoqiang Bai Hongyi Chen 《Journal of Signal Processing Systems》2010,58(3):311-324

相似文献