Similar Literature
1.
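Yu Zhongying (俞中英), Zhu En (朱恩). 《电子器件》, 2007, 30(6): 2028-2031
A high-speed 1024-point FFT processor was designed using a standard-cell library for the TSMC 0.18 μm CMOS process. Data are held in IEEE 754 single-precision floating-point format for high-precision processing. An improved radix-2 decimation-in-time algorithm reduces addressing complexity, a pipelined butterfly unit raises the operating frequency, and a new twiddle-factor storage scheme that exploits trigonometric identities shrinks the ROM by 75% compared with conventional designs. Post-synthesis and post-layout reports show that the processor can run at 167 MHz, completes a 1024-point FFT in only 37.7 μs, and has an FFT core area of 1.4 mm².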
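The abstract above does not spell out the twiddle-factor storage scheme. As a rough illustration only, the sketch below shows one common way to exploit trigonometric symmetry: keep a quarter-wave sine table and derive every radix-2 twiddle factor W_N^k from it. The names `SIN_ROM` and `twiddle` are hypothetical, and this is not claimed to be the authors' exact scheme.

```python
import cmath
import math

N = 1024                 # transform length (assumed, matching the abstract)
QUARTER = N // 4

# Hypothetical ROM contents: sin(2*pi*k/N) for k = 0 .. N/4 (one quarter wave).
SIN_ROM = [math.sin(2 * math.pi * k / N) for k in range(QUARTER + 1)]


def twiddle(k):
    """Return W_N^k = exp(-2j*pi*k/N) for 0 <= k < N/2 from the quarter-wave table."""
    if k <= QUARTER:
        # First quadrant: cos(t) = sin(pi/2 - t).
        s = SIN_ROM[k]
        c = SIN_ROM[QUARTER - k]
    else:
        # Second quadrant: sin(t) = sin(pi - t), cos(t) = -sin(t - pi/2).
        s = SIN_ROM[N // 2 - k]
        c = -SIN_ROM[k - QUARTER]
    return complex(c, -s)


if __name__ == "__main__":
    worst = max(abs(twiddle(k) - cmath.exp(-2j * math.pi * k / N))
                for k in range(N // 2))
    print("table words:", len(SIN_ROM), "max error:", worst)
```

With N = 1024 the table holds N/4 + 1 = 257 real words, against the 1024 words needed to store the cosine and sine of all 512 twiddle factors directly, which is in line with the roughly 75% saving the abstract reports.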

2.
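High-speed image processing and signal analysis systems need FFTs with very large point counts, for example 64K points, at a throughput of 250 MSPS. Such FFTs must not only be very long but also meet precision requirements, and the block floating-point structure commonly used in earlier designs loses precision as the number of points grows. To reach the required precision, a pure floating-point arithmetic structure is proposed, with the FFT hardware based mainly on a radix-16 algorithm. The paper describes the hardware implementation of the very long high-speed FFT, including the radix-16 core arithmetic structure, RAM partitioning, twiddle-factor generation, and the pure floating-point arithmetic structure. The FPGA implementation reaches 250 MSPS.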

3.
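This paper proposes a hardware architecture for an FFT processor with a configurable point count, based on the Radix-2² FFT algorithm. The last butterfly stage of the Radix-2² algorithm can be selected as either butterfly 1 or butterfly 1/butterfly 2, so that an FFT of any 2^n points can be computed. On this basis, a configurable-point FFT hardware structure is proposed, and an ASIC design of an FFT processor configurable from 256 to 2048 points is completed using a serial pipelined single-path delay commutator structure. Chip test results verify that the Radix-2²-based configurable FFT hardware structure can perform spectrum analysis for 256 to 4096 points; a 4096-point FFT takes less than 90.02 μs and the SQNR reaches 50.65 dB, which meets application requirements.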

4.
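Design of a low-power, high-speed FFT/IFFT processor for ultra-wideband systems   Cited by: 1 (self-citations: 0, other citations: 1)
An FFT/IFFT processor for ultra-wideband (UWB) wireless communication systems is designed. It uses an 8×8×2 mixed-radix algorithm and supports either two 64-point FFTs or one 128-point FFT, and a new 8-way parallel feedback structure is proposed for this algorithm. The structure raises data throughput and lowers chip power. To reduce the number of multiplications and improve timing, an improved shift-and-add algorithm is proposed. The processor is fabricated in the SMIC 0.13 μm CMOS process with a core area of 1.44 mm². Measurements show a peak data throughput of 1 Gsample/s, and at the typical operating rate of 500 Msample/s the chip consumes 39.6 mW. Compared with existing FFT chips of the same type, the chip area is 40% smaller and the power is 45% lower.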

5.
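《信息技术》, 2017, (4): 61-64
The paper first discusses several FFT algorithms and their basic principles, implements the radix-2 decimation-in-frequency algorithm, and builds a single-precision floating-point FFT processor around a single butterfly that processes data sequentially. Following a top-down methodology, the design is divided into six submodules, which are designed separately and then assembled into the FFT processor. The paper then describes the hardware implementation of the floating-point adder and multiplier; pipelining them greatly increases data throughput and processing speed. The intermediate-result buffer uses a three-port RAM from the Altera IP cores, which can be read and written simultaneously and substantially cuts computation time. Finally, functional and timing simulations of the FFT processor are carried out and analyzed in detail. The results show that the processor achieves high arithmetic precision, runs stably at 62.5 MHz, and completes a 256-point complex floating-point FFT in 33.056 μs. It has a performance advantage over FFT implementations on DSPs and microcontrollers.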
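For readers unfamiliar with the radix-2 decimation-in-frequency algorithm named above, a minimal software sketch follows. It models only the dataflow (one butterfly at a time, then bit-reversed reordering of the output), not the floating-point units, pipelining, or three-port RAM of the paper's hardware; the function names are assumptions made for illustration.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_dif_radix2(x):
    """Radix-2 decimation-in-frequency FFT of a power-of-two-length sequence."""
    a = [complex(v) for v in x]
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for j in range(span):
                # Twiddle factor of the current sub-transform (length 2*span).
                w = cmath.exp(-2j * cmath.pi * j / (2 * span))
                u, v = a[start + j], a[start + j + span]
                a[start + j] = u + v                 # sum branch
                a[start + j + span] = (u - v) * w    # difference branch, then twiddle
        span //= 2
    # In-place DIF leaves the results in bit-reversed order; undo that here.
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        out[bit_reverse(i, bits)] = a[i]
    return out


if __name__ == "__main__":
    print(fft_dif_radix2([1, 0, 0, 0, 0, 0, 0, 0]))
```

A 256-point transform has 8 stages of 128 butterflies, 1024 butterflies in total, and 33.056 μs at 62.5 MHz corresponds to 2066 clock cycles.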

6.
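A 64-point FFT/IFFT processor for 802.11a is designed. It uses a single-butterfly, 4-way parallel structure together with a proposed conflict-free 4-way parallel address generation method, which effectively raises throughput: a 64-point FFT/IFFT takes only 63 clock cycles. A proposed double ping-pong RAM structure buffers continuous input and output data streams. Besides supporting both the 64-point FFT and IFFT, the word width can be configured freely to suit the system. To improve arithmetic precision, the design adopts a block floating-point algorithm, trading off precision against resources. With a 16-bit word width, synthesized in the HJTC 0.18 μm CMOS process, the core area is 0.6267 mm², the chip area is 1.35 mm × 1.27 mm, the maximum operating frequency reaches 300 MHz, and the power consumption is 126.17 mW.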
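Block floating point, as used in the design above, keeps fixed-point words but lets a whole block of data share one exponent. The sketch below is a minimal illustration under assumed conventions (signed integers, 16-bit words, the hypothetical name `bfp_normalize`); the paper's actual scaling rule is not given in the abstract.

```python
def bfp_normalize(block, exponent, width=16):
    """Right-shift a block of signed integers until every value fits in
    `width` bits, and add the shift amount to the shared block exponent.

    Returns (scaled_block, new_exponent); each original value is
    approximately scaled_value * 2**(new_exponent - exponent).
    """
    limit = 1 << (width - 1)              # magnitudes must stay below this
    peak = max(abs(v) for v in block)
    shift = 0
    while (peak >> shift) >= limit:
        shift += 1
    # Python's >> on negative ints is an arithmetic (sign-preserving) shift.
    return [v >> shift for v in block], exponent + shift


if __name__ == "__main__":
    # Example: values that have grown past 16 bits after a butterfly stage.
    print(bfp_normalize([40000, -70000, 123], exponent=0))
```

Applying such a step after each FFT stage keeps the largest value near full scale, which is where the precision gain over plain fixed-point scaling comes from; small values in the same block still lose low-order bits.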

7.
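The design of a configurable high-precision FFT/IFFT processor is presented. A single-butterfly, mixed-radix serial structure reduces system complexity and saves resources. A novel block floating-point algorithm is proposed that effectively avoids overflow and improves precision. The transform length can be configured to 64, 128, 256, 512, or 1024 points through bit selection on the address-generation counter, and the real and imaginary parts are both 16-bit data. Both FFT and IFFT are supported. Synthesized in the SMIC 0.13 μm CMOS process, the area is 1.55 mm² and the maximum frequency is 210 MHz. Test results demonstrate the high precision of the design.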

8.
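The design of a low-power, configurable floating-point fast Fourier transform (FFT) processor based on a field-programmable gate array (FPGA) is described; it supports 4-, 16-, 64-, and 256-point transforms. It uses a radix-4 decimation-in-frequency algorithm and a memory-based single-butterfly structure. The butterfly unit is optimized to reduce the number of multipliers, which lowers power consumption. The memory uses a ping-pong structure to raise data throughput, and floating-point arithmetic improves computational accuracy. Synthesized with the Semiconductor Manufacturing International Corporation (SMIC) 0.18 μm process library, the processor consumes 0.82 mW/MHz, and it is implemented on an ACX1329-CSG324 FPGA.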
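A radix-4 butterfly is attractive in hardware because its internal factors are only ±1 and ±j, so the 4-point core needs no real multipliers; multipliers are needed only for the twiddle factors applied between stages. The sketch below illustrates that property. It is not the optimized butterfly of the paper, and the function names are assumptions.

```python
def mul_neg_j(z):
    """Multiply by -j without a multiplier: (x + jy) * (-j) = y - jx."""
    return complex(z.imag, -z.real)


def radix4_butterfly(a, b, c, d):
    """4-point DFT of (a, b, c, d) using only additions, subtractions,
    and a real/imaginary swap."""
    t0 = a + c
    t1 = a - c
    t2 = b + d
    t3 = mul_neg_j(b - d)
    return (t0 + t2,   # X[0] = a + b + c + d
            t1 + t3,   # X[1] = a - jb - c + jd
            t0 - t2,   # X[2] = a - b + c - d
            t1 - t3)   # X[3] = a + jb - c - jd


if __name__ == "__main__":
    print(radix4_butterfly(1 + 2j, 3 - 1j, -2 + 0.5j, 4j))
```

In a decimation-in-frequency stage, outputs 1 to 3 would then be multiplied by their twiddle factors, so each radix-4 butterfly needs at most three complex multiplications.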

9.
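Wang Zhen (王祯), Han Zeyao (韩泽耀). 《信息技术》, 2008, 32(3): 34-37
A new FFT processor based on the radix-2³ algorithm with a single-path feedback pipeline structure is proposed. By dynamically adjusting the data path, it removes the restriction that the transform length be a power of 8 and efficiently implements FFT/IFFT transforms of any 2^n points. A custom floating-point format is introduced into the pipeline and a preprocessing unit is added at the pipeline input, which effectively improves the transform precision without adding much logic, while memory usage falls by 10%.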

10.
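Design and implementation of a scalable high-speed FFT processor based on FPGA   Cited by: 3 (self-citations: 1, other citations: 2)
Liu Xiaoming (刘晓明), Sun Xue (孙学). 《电讯技术》, 2005, 45(3): 147-151
This paper presents the structural design of an FPGA-based pipelined FFT processor whose transform length can be flexibly extended, together with the algorithmic implementation of each functional module. These include the pipelined structure for the highly composite FFT algorithm, the inter-stage shuffled-order RAM read/write address patterns, the array processing structure for short-length FFTs, and the pipelined structure of a two's-complement CORDIC algorithm. A 64-point FFT processor was assembled from the functional modules implemented on the FPGA. Its computing performance indicates that, at an input data rate of 20 MHz, an FFT processor built with this structure computes a 1024-point FFT in about 52 μs.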
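CORDIC, mentioned above, replaces the complex multiplication by a twiddle factor with a sequence of shift-and-add micro-rotations. The sketch below shows the rotation mode in floating point purely to illustrate the iteration; scaling by 2**-i stands in for a hardware right shift, and the two's-complement pipelined structure of the paper is not modeled. `ITERATIONS`, `ANGLES`, `GAIN`, and `cordic_rotate` are hypothetical names.

```python
import math

ITERATIONS = 16  # assumed number of micro-rotations

# Elementary rotation angles atan(2**-i), as would be stored in a small table.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Accumulated magnitude gain of the micro-rotations (about 1.6468).
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_rotate(x, y, theta):
    """Rotate (x, y) by theta radians; converges for |theta| up to about 1.74."""
    z = theta
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0 else -1.0        # steer the residual angle toward zero
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x / GAIN, y / GAIN              # remove the constant CORDIC gain


if __name__ == "__main__":
    print(cordic_rotate(1.0, 0.0, 0.6), (math.cos(0.6), math.sin(0.6)))
```

Each iteration adds roughly one bit of angular resolution, so the iteration count is normally chosen to match the data word width.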

11.
This paper presents a high-throughput, size-configurable floating-point (FP) fast Fourier transform (FFT) processor that implements 8-parallel multi-path delay feedback (MDF) and is suitable for real-time radar imaging systems. In floating-point FFT design, achieving high throughput within a restricted area and power budget is a greater challenge than in fixed-point design, because FP operations are more complex to realize. To address this, a novel mixed-radix FFT algorithm featuring a single-sided binary-tree decomposition strategy is proposed to contain the multiplication complexity for any 2^k-point FFT. To this end, the parallel-processing twiddle factor generator and the dual addition-and-rounding fused FP arithmetic units are optimized to meet the high accuracy demand in computation and the low power budget in implementation. The proposed FP FFT processor has been designed in silicon in SMIC's 28 nm CMOS technology with an active area of 1.39 mm². The prototype delivers a throughput of 4 GSample/s at 500 MHz with a peak power consumption of 84.2 mW. The proposed design approach thus improves power efficiency by approximately 14 times on average over previously reported FP FFT processors.

12.
This paper proposes architectures for dual-mode and tri-mode dynamically configurable multipliers for quadruple precision arithmetic. The proposed dual-mode QPdDP multiplier architectures can either compute on a pair of quadruple precision (QP) operands or provide SIMD support for two parallel (dual) sets of double precision (DP) operands. The proposed tri-mode QPdDPqSP multiplier architectures additionally support four parallel (quad) single precision (SP) operations alongside dual-DP and single-QP operand processing. For the largest underlying sub-component, the mantissa multiplier, two methods are analyzed for the dual-mode/tri-mode architectures: one is based on the Karatsuba method, and the other is a proposed dual-mode/tri-mode Radix-4 Modified Booth (MB) multiplier. The proposed dual-mode/tri-mode MB multiplier requires only a few extra 2:1 MUXs as overhead compared to a simple MB multiplier. To support dual-mode/tri-mode operation, the other important sub-components of FP multiplication are also redesigned for multi-mode support. The proposed architectures are synthesized using UMC 90 nm ASIC technology and are compared against prior literature in terms of area, period, and a unified metric “Area (Gate Count) × Period (FO4) × Latency × Throughput (in cycles)”. The dual-mode/tri-mode FP architectures with MB mantissa multipliers show better timing, whereas those with Karatsuba mantissa multipliers occupy a smaller area.
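The Karatsuba method named above builds a wide product from three half-width products instead of four. The sketch below shows the idea on plain Python integers, using the 113-bit significand width of IEEE 754 quadruple precision as the example size; the 32-bit base-case threshold and the function name are assumptions, and the paper's multi-mode partitioning is not modeled.

```python
def karatsuba(a, b, bits):
    """Multiply non-negative integers a and b, each below 2**bits,
    with three recursive half-width products per level."""
    if bits <= 32:                 # assumed width of a directly available multiplier
        return a * b
    half = bits // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = karatsuba(a_lo, b_lo, half)
    hi = karatsuba(a_hi, b_hi, bits - half)
    # (a_lo + a_hi) * (b_lo + b_hi) - lo - hi == a_lo*b_hi + a_hi*b_lo
    mid = karatsuba(a_lo + a_hi, b_lo + b_hi, bits - half + 1) - lo - hi
    return (hi << (2 * half)) + (mid << half) + lo


if __name__ == "__main__":
    # Two 113-bit significands with the leading (implicit) bit set.
    a = (1 << 112) | 0x123456789ABCDEF0123456789AB
    b = (1 << 112) | 0xFEDCBA9876543210FEDCBA98765
    print(karatsuba(a, b, 113) == a * b)
```

Fewer partial multipliers at the cost of extra adders and a longer dependency chain is consistent with the area and timing trade-off the abstract reports between the Karatsuba and Modified Booth variants.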

13.
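Design of a 32-bit floating-point FFT processor based on FPGA   Cited by: 5 (self-citations: 3, other citations: 5)
The design of an FPGA-based 1024-point, 32-bit floating-point FFT processor is described. An improved butterfly unit reduces hardware cost and improves system performance. The multi-stage pipelining of the 32-bit floating-point adder/subtractor and multiplier, which raises system performance, is discussed in detail. The use of floating-point arithmetic gives the system high processing precision.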

14.
In this brief, a high-throughput and low-complexity fast Fourier transform (FFT) processor for wideband orthogonal frequency division multiplexing communication systems is presented. A new indexed-scaling method is proposed to reduce both the critical-path delay and hardware cost by employing shorter wordlength. Together with the mixed-radix multipath delay feedback structure, the proposed FFT processor can achieve very high throughput with low hardware cost. From analysis, it is shown that the proposed indexed-scaling method can save at least 11% memory utilizations compared to other state-of-the-art scaling algorithms. Also, a test chip of a 1.2 Gsample/s 2048-point FFT processor has been designed using UMC 90-nm 1P9M process with a core area of 0.97 mm². The signal-to-quantization-noise ratio (SQNR) performance of this test chip is over 32.7 dB to support 16-QAM modulation and the power consumption is about 117 mW at 300 MHz. Compared to the fixed-point FFT processors, about 26% area and 28% power can be saved under the same throughput and SQNR specifications.

15.
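An efficient FFT (fast Fourier transform) pipeline structure is described that both raises the data rate and effectively reduces the size of the design. As a key part of an OFDM (orthogonal frequency division multiplexing) system implementation, the FFT design determines the implementation scale of the whole system. As one application, the authors used this FFT structure in a DVB-T receiver to demodulate both the 2K and 8K modes. The structure can also be readily applied wherever an FFT is needed, and it makes it easy to support several modes side by side.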

16.
A 1-GS/s FFT/IFFT processor for UWB applications   Cited by: 1 (self-citations: 0, other citations: 1)
In this paper, we present a novel 128-point FFT/IFFT processor for ultrawideband (UWB) systems. The proposed pipelined FFT architecture, called mixed-radix multipath delay feedback (MRMDF), can provide a higher throughput rate by using the multidata-path scheme. Furthermore, the hardware costs of memory and complex multipliers in MRMDF are only 38.9% and 44.8% of those in the known FFT processor by means of the delay feedback and the data scheduling approaches. The high-radix FFT algorithm is also realized in our processor to reduce the number of complex multiplications. A test chip for the UWB system has been designed and fabricated using 0.18-μm single-poly and six-metal CMOS process with a core area of 1.76 × 1.76 mm², including an FFT/IFFT processor and a test module. The throughput rate of this fabricated FFT processor is up to 1 Gsample/s while it consumes 175 mW. Power dissipation is 77.6 mW when its throughput rate meets UWB standard in which the FFT throughput rate is 409.6 Msample/s.

17.
In this paper, we present a novel 128/64 point fast Fourier transform (FFT)/inverse FFT (IFFT) processor for the applications in a multiple-input multiple-output orthogonal frequency-division multiplexing based IEEE 802.11n wireless local area network baseband processor. The unfolding mixed-radix multipath delay feedback FFT architecture is proposed to efficiently deal with multiple data sequences. The proposed processor not only supports the operation of FFT/IFFT in 128 points and 64 points but can also provide different throughput rates for 1-4 simultaneous data sequences to meet IEEE 802.11n requirements. Furthermore, less hardware complexity is needed in our design compared with traditional four-parallel approach. The proposed FFT/IFFT processor is designed in a 0.13-μm single-poly and eight-metal CMOS process. The core area is 660 × 2142 μm², including an FFT/IFFT processor and a test module. At the operation clock rate of 40 MHz, our proposed processor can calculate 128-point FFT with four independent data sequences within 3.2 μs, meeting IEEE 802.11n standard requirements.

18.
VLSI implementation of MIMO detection using the sphere decoding algorithm   Cited by: 3 (self-citations: 0, other citations: 3)
Multiple-input multiple-output (MIMO) techniques are a key enabling technology for high-rate wireless communications. This paper discusses two ASIC implementations of MIMO sphere decoders. The first ASIC attains maximum-likelihood performance with an average throughput of 73 Mb/s at a signal-to-noise ratio (SNR) of 20 dB; the second ASIC shows only a negligible bit-error-rate degradation and achieves a throughput of 170 Mb/s at the same SNR. The three key contributing factors to high throughput and low complexity are: depth-first tree traversal with radius reduction, implemented in a one-node-per-cycle architecture, the use of the ℓ^∞- instead of ℓ²-norm, and, finally, the efficient implementation of the enumeration approach recently proposed in . The resulting ASICs currently rank among the fastest reported MIMO detector implementations.

19.
This research work focuses on the design of a high-resolution fast Fourier transform (FFT)/inverse FFT (IFFT) processor for constraints analysis purposes. Among the major setbacks associated with such high-resolution FFT processors is the high power consumption resulting from the structural complexity and computational inefficiency of floating-point calculations. As such, a parallel pipelined architecture was proposed to statically scale the resolution of the processor to suit adequate trade-off constraints. Quantisation was applied to provide an approximation that addresses the finite word-length constraints of digital signal processing. An optimum operating mode was proposed, based on the signal-to-quantisation-noise ratio (SQNR) as well as the statistical theory of quantisation, to minimise the trade-off issues associated with selecting the most application-efficient floating-point processing capability in contrast to their resolution quality.
