首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
A 2.4-Gsample/s DVFS FFT Processor for MIMO OFDM Communication Systems   总被引:1,自引:0,他引:1  
This paper presents a new dynamic voltage and frequency scaling (DVFS) FFT processor for MIMO OFDM applications. By the proposed multimode multipath-delay-feedback (MMDF) architecture, our FFT processor can process 1-8-stream 256-point FFTs or a high-speed 256-point FFT in two processing domains at minimum clock frequency for DVFS operations. A parallelized radix-24 FFT algorithm is also employed to save the power consumption and hardware cost of complex multipliers. Furthermore, a novel open-loop voltage detection and scaling (OLVDS) mechanism is proposed for fast and robust voltage management. With these schemes, the proposed FFT processor can operate at adequate voltage/frequency under different configurations to support the power-aware feature. A test chip of the proposed FFT processor has been fabricated using UMC 90 nm single-poly nine-metal CMOS process with a core area of 1.88 times1.88 mm2 . The SQNR performance of this FFT chip is over 35.8 dB for QPSK/16-QAM modulation. Power dissipation of 2.4 Gsample/s 256-point FFT computations is about 119.7 mW at 0.85 V. Depending on the operation mode, power can be saved by 18%-43% with voltage scaling in TT corner.  相似文献   

2.
A 1-GS/s FFT/IFFT processor for UWB applications   总被引:1,自引:0,他引:1  
In this paper, we present a novel 128-point FFT/IFFT processor for ultrawideband (UWB) systems. The proposed pipelined FFT architecture, called mixed-radix multipath delay feedback (MRMDF), can provide a higher throughput rate by using the multidata-path scheme. Furthermore, the hardware costs of memory and complex multipliers in MRMDF are only 38.9% and 44.8% of those in the known FFT processor by means of the delay feedback and the data scheduling approaches. The high-radix FFT algorithm is also realized in our processor to reduce the number of complex multiplications. A test chip for the UWB system has been designed and fabricated using 0.18-/spl mu/m single-poly and six-metal CMOS process with a core area of 1.76/spl times/1.76 mm/sup 2/, including an FFT/IFFT processor and a test module. The throughput rate of this fabricated FFT processor is up to 1 Gsample/s while it consumes 175 mW. Power dissipation is 77.6 mW when its throughput rate meets UWB standard in which the FFT throughput rate is 409.6 Msample/s.  相似文献   

3.
高吞吐浮点可灵活重构的快速傅里叶变换(FFT)处理器可满足尖端雷达实时成像和高精度科学计算等多种应用需求。与定点FFT相比,浮点运算复杂度更高,使得浮点型FFT的运算吞吐率与其实现面积、功耗之间的矛盾问题尤为突出。鉴于此,为降低运算复杂度,首先将大点数FFT分解成若干个小点数基2k 级联子级实现,提出分别针对128/256/512/1024/2048点FFT的优化混合基算法。同时,结合所提出同时支持单通道单精度和双通道半精度两种浮点模式的新型融合加减与点乘运算单元,首次提出一款高吞吐率双模浮点可变点FFT处理器结构,并在28 nm标准CMOS工艺下进行设计并实现。实验结果表明,单通道单精度和双通道半精度浮点两种模式下的运算吞吐率和输出平均信号量化噪声比分别为3.478 GSample/s, 135 dB和6.957 GSample/s, 60 dB。归一化吞吐率面积比相比于现有其他浮点FFT实现可提高约12倍。  相似文献   

4.
In this paper, we present a novel 128/64 point fast Fourier transform (FFT)/ inverse FFT (IFFT) processor for the applications in a multiple-input multiple-output orthogonal frequency-division multiplexing based IEEE 802.11n wireless local area network baseband processor. The unfolding mixed-radix multipath delay feedback FFT architecture is proposed to efficiently deal with multiple data sequences. The proposed processor not only supports the operation of FFT/IFFT in 128 points and 64 points but can also provide different throughput rates for 1-4 simultaneous data sequences to meet IEEE 802.11n requirements. Furthermore, less hardware complexity is needed in our design compared with traditional four-parallel approach. The proposed FFT/IFFT processor is designed in a 0.13-mum single-poly and eight-metal CMOS process. The core area is 660times2142 mum2 , including an FFT/IFFT processor and a test module. At the operation clock rate of 40 MHz, our proposed processor can calculate 128-point FFT with four independent data sequences within 3.2 mus meeting IEEE 802.11n standard requirements  相似文献   

5.
A dynamic scaling FFT processor for DVB-T applications   总被引:1,自引:0,他引:1  
This paper presents an 8192-point FFT processor for DVB-T systems, in which a three-step radix-8 FFT algorithm, a new dynamic scaling approach, and a novel matrix prefetch buffer are exploited. About 64 K bit memory space can be saved in the 8 K point FFT by the proposed dynamic scaling approach. Moreover, with data scheduling and pre-fetched buffering, single-port memory can be adopted without degrading throughput rate. A test chip for 8 K mode DVB-T system has been designed and fabricated using 0.18-/spl mu/m single-poly six-metal CMOS process with core area of 4.84 mm/sup 2/. Power dissipation is about 25.2 mW at 20 MHz.  相似文献   

6.
In this brief, multi-path delay commutator structures are utilized to improve the throughput rate of radix-2 and radix-4 FFT computation by a factor of 2 to 4. Latency can also be reduced by a factor of 2 to 3. Compared with previous radix-2 and radix-4 FFT structures, the proposed high-throughput FFT with doubled throughput rate requires similar or even less hardware cost. Although split radix FFT design is more hardware efficient, the regular structure of proposed FFT structures are attractive for high throughput FFT design. [All rights reserved Elsevier].  相似文献   

7.
With the upcoming fourth generation wireless systems and convergence of multiple radio standards into a single terminal, building blocks are needed that can be configured for computing various algorithms used in different standards. Given these trends and requirements, in our previous paper we successfully proposed a circuit-sharing design for the DAB receiver to integrate the FFT (fast Fourier transform) and IMDCT (inverse modified discrete cosine transform) operations into the same functional circuit. The proposed technique reduces hardware overhead, enhances circuit efficiency and significantly reduces the cost of DAB receivers. To further improve the efficiency of our previous work, this investigation proposes another alternative design named the circuit-sharing pipeline design (CSPD) using a single processor with a pipeline scheme to combine two functions, FFT and IMDCT, into the same circuit. The proposed method reduces the required chip area and cost of DAB receivers. Analyzing the existing relationship among IMDCT, DCT (discrete cosine transform) and FFT, the IMDCT function can be replaced using a FFT function. Therefore, it is not necessary to design an extra circuit for IMDCT. The arithmetic unit in the FFT processor can be significantly reduced due to employing the pipeline scheme. Consequently, the circuit redundancies in the IMDCT and FFT functions can be easily eliminated to allow exploitation of the decreased chip area. Results of this study demonstrate that the proposed design can provide advantages such as low gate count and small memory size in the DAB (digital audio broadcasting) receiver.  相似文献   

8.
A pipelined Fast Fourier Transform and its inverse (FFT/IFFT) processor, which utilizes hardware resources efficiently, is proposed for MIMO-OFDM WLAN 802.11n. Compared with a conventional MIMO-OFDM implementation, (in which as many FFT/IFFT processors as the number of transmit/receive antennas is used), the proposed architecture (using hardware sharing among multiple data sequences) reduces hardware complexity without sacrificing system throughput. Further, the proposed architecture can support 1–4 input data sequences with sequence lengths of 64 or 128, as needed. The FFT/IFFT processor is synthesized using TSMC 0.18 um CMOS technology and saves 25% area compared to a conventional implementation approach using radix-23 algorithm. The proposed FFT/IFFT processor can be configured to improve power efficiency according to the number of input data sequences and the sequence length. The processor consumes 38 mW at 75 MHz for one input sequence with 64-point length; it consumes 87 mW at 75 MHz for four input sequences with length 128-point and can be efficiently used for IEEE 802.11n WLAN standard.
Paul AmpaduEmail:
  相似文献   

9.
A design space exploration methodology of 1-D FFT processor is proposed to find the best hardware architecture in a quantitative way during early design. The methodology includes architecture candidate collection, coarse-grained architecture selection, and circuit level design optimizations. We show how to select a better architecture from candidates including different architectures (SDF, SDC, MDF, MDC and memory-based) with different degree of parallelism at different radices. The sub-level designs, including designs of rotator and data scaling module, are introduced for further optimizations. As a proof of concept, an FFT processor for 4G, WLAN and future 5G is designed supporting 16-4096 and 12-2400 point FFTs. Memory-based architecture with 16-datapath mixed-radix butterfly unit is selected to satisfy the demands for 1GS/s (4096) throughput. The synthesis result based on 65nm technology shows that the silicon cost and power consumption are 1.46mm2 and 68.64mW respectively. The proposed processor has better normalized throughput per area unit and normalized FFTs per energy unit than the state of the art available designs.  相似文献   

10.
11.
This paper proposes a circuit-sharing approach to improve efficiency for the key digital audio broadcasting (DAB) techniques, i.e., MPEG1-audio decoding and orthogonal frequency division multiplexing (OFDM). Because OFDM's fast Fourier transform (FFT) requires heavy computational power for implementation, a single butterfly processing element (BPE) is adopted to reduce the chip area required for FFT. Furthermore, by modifying the BPE logical combinational circuit, both IMDCT (inverse modified discrete cosine transform) and FFT functions can be obtained simultaneously from a single VLSI chip. Therefore, the proposed technique reduces hardware overhead, enhances circuit efficiency and significantly reduces the cost of DAB receivers. The proposed circuit is simulated as a VLSI prototype chip using a 0.35 /spl mu/m CMOS process, with a chip area of about 22.09 mm/sup 2/ and a total gate count of approximately 10839 (excluding ROM and RAM).  相似文献   

12.
对FFT处理器的实现算法-频域抽取基4算法做了介绍。介绍一种以FPGA作为设计载体,设计和实现一套集成于FPGA内部的FFT处理器的方法和设计过程。FFT处理器的硬件试验结果表明该处理器的运算结果正确,并且具有较高运算速度。该方法具有设计简单灵活,体积小等优点,可用于雷达处理、高速图像处理和数字通信等应用场合。  相似文献   

13.
This paper presents a high throughput size-configurable floating point (FP) Fast Fourier Transform (FFT) processor, having implemented the 8-parallel multi-path delay feedback (MDF) functions suitable for applications in the real-time radar imaging system. With regard to floating-point FFT design, to acquire a high throughput with restricted area and power consumptions poses as a greater challenge due to some higher degrees of complexity involved in realizing of FP operations than those fixed-point counterparts. To address the related issues, a novel mixed-radix FFT algorithm featuring the single-sided binary-tree decomposition strategy is proposed aiming at effectively containing the complexity of multiplications for any 2k-point FFT. To this aid, the parallel-processing twiddle factor generator and the dual addition-and-rounding fused FP arithmetic units are optimized to meet the high accuracy demand in computation and the low power budget in implementation. The proposed FP FFT processor has been designed in silicon based on SMIC's 28 nm CMOS technology with the active area of 1.39 mm2. The prototype design delivers a throughput of 4 GSample/s at 500 MHz, at a peak power consumption of 84.2 mW. Thus, the proposed design approach achieves a significant improvement in power efficiency approximately by 14 times on average over some other FP FFT processors previously reported.  相似文献   

14.
设计了一种应用于超宽带(UWB)无线通信系统中的FFT/IFFT处理器。该处理器采用基24算法进行FFT运算,利用8路并入并出的流水线结构实现该算法,提高了处理器的数据吞吐率,降低了芯片功耗。提出了一种新颖的数据处理方式,在保证信噪比的情况下节约了逻辑资源。在乘法器的设计环节,针对UWB系统的具体特点,在结构上对乘法器进行了改进和优化,提高了乘法器的性能。最后,设计的FFT/IFFT处理器采用TSMC 0.18μm CMOS标准工艺库综合,芯片的内核面积为0.762mm2(不含测试电路)。在1.8V,25℃条件下,最大工作时钟317.199MHz,在UWB典型的工作频率下,内核功耗为33.5304mW。  相似文献   

15.
文中设计了一款64点基-4FFT处理器,用改进的CORDIC (MVR-CORDIC)处理单元代替常规FFT处理器中的复数乘法器,改进的CORDIC处理单元在保证SQNR性能下,仅用极少次数的移位加法运算即可完成一次复数乘法,缩减了完成一次基本蝶形运算的时间并减小了面积开销。该FFT处理器结构采用两块独立的RAM,并对中间数据作“乒-乓”式存储操作以节省数据存储时间,从而提高完成一次FFT运算的速度。所设计的FFT处理器通过FPGA进行验证,结果表明平均完成一次64点FFT运算仅需要不到1μs。  相似文献   

16.
流水线结构FFT/IFFT处理器的设计与实现   总被引:1,自引:0,他引:1  
针对实时高速信号处理的要求,设计并实现了一种高效的FFT处理器。在分析了FFT算法的复杂度和硬件实现结构的基础上,处理器采用了按频率抽取的基—4算法,分级流水线以及定点运算结构。可以根据要求设置成4P点的FFT或IFFT。处理器可以对多个输入序列进行连续的FFT运算,消除了数据的输入输出对延时的影响。平均每完成一次N点FFT运算仅需要Ⅳ个时钟周期。整个设计基于Verilog HDL语言进行模块化设计。并在Altera公司的Cyclone Ⅱ器件上实现。  相似文献   

17.
Processor is the core chip of modern information system, which is severely threatened by hardware Trojan. Side-channel analysis is the most promising method for hardware Trojan detection. However, most existing detection methods require golden chips as reference, which significantly increases the test cost and complexity. In this paper, we propose a golden-free detection method that exploits the bit power consistency of processor. For the data activated processor hardware Trojan, the power model of processor is modified. Two decomposition methods of power signal are proposed: the differential bit power consistency analysis and the contradictory equations solution. With the proposed method, each bit power can be calculated. The bit consistency based detection algorithms are proposed, the deviation boundaries are obtained by statistical analysis. Experimental measurements were done on field programmable gate array chip with open source 8051 core and hardware Trojans. The results showed that the differences between the two methods were very small. The data activated processor hardware Trojans were detected successfully.  相似文献   

18.
This investigation proposes a novel radix-42 algorithm with the low computational complexity of a radix-16 algorithm but the lower hardware requirement of a radix-4 algorithm. The proposed pipeline radix-42 single delay feedback path (R42SDF) architecture adopts a multiplierless radix-4 butterfly structure, based on the specific linear mapping of common factor algorithm (CFA), to support both 256-point fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) and 8times8 2D discrete cosine transform (DCT) modes following with the high efficient feedback shift registers architecture. The segment shift register (SSR) and overturn shift register (OSR) structure are adopted to minimize the register cost for the input re-ordering and post computation operations in the 8times8 2D DCT mode, respectively. Moreover, the retrenched constant multiplier and eight-folded complex multiplier structures are adopted to decrease the multiplier cost and the coefficient ROM size with the complex conjugate symmetry rule and subexpression elimination technology. To further decrease the chip cost, a finite wordlength analysis is provided to indicate that the proposed architecture only requires a 13-bit internal wordlength to achieve 40-dB signal-to-noise ratio (SNR) performance in 256-point FFT/IFFT modes and high digital video (DV) compression quality in 8 times 8 2D DCT mode. The comprehensive comparison results indicate that the proposed cost effective reconfigurable design has the smallest hardware requirement and largest hardware utilization among the tested architectures for the FFT/IFFT computation, and thus has the highest cost efficiency. The derivation and chip implementation results show that the proposed pipeline 256-point FFT/IFFT/2D DCT triple-mode chip consumes 22.37 mW at 100 MHz at 1.2-V supply voltage in TSMC 0.13-mum CMOS process, which is very appropriate for the RSoCs IP of next-generation handheld devices.  相似文献   

19.
In this paper, we present a fast Fourier transform (FFT) processor with four parallel data paths for multiband orthogonal frequency‐division multiplexing ultra‐wideband systems. The proposed 128‐point FFT processor employs both a modified radix‐24 algorithm and a radix‐23 algorithm to significantly reduce the numbers of complex constant multipliers and complex booth multipliers. It also employs substructure‐sharing multiplication units instead of constant multipliers to efficiently conduct multiplication operations with only addition and shift operations. The proposed FFT processor is implemented and tested using 0.18 µm CMOS technology with a supply voltage of 1.8 V. The hardware‐ efficient 128‐point FFT processor with four data streams can support a data processing rate of up to 1 Gsample/s while consuming 112 mW. The implementation results show that the proposed 128‐point mixed‐radix FFT architecture significantly reduces the hardware cost and power consumption in comparison to existing 128‐point FFT architectures.  相似文献   

20.
In this paper, a VLSI architecture based on radix-2/sup 2/ integer fast Fourier transform (IntFFT) is proposed to demonstrate its efficiency. The IntFFT algorithm guarantees the perfect reconstruction property of transformed samples. For a 64-points radix-2/sup 2/ FFT architecture, the proposed architecture uses 2 sets of complex multipliers (six real multipliers) and has 6 pipeline stages. By exploiting the symmetric property of lossless transform, the memory usage is reduced by 27.4%. The whole design is synthesized and simulated with a 0.18-/spl mu/m TSMC 1P6M standard cell library and its reported equivalent gate count usage is 17,963 gates. The whole chip size is 975 /spl mu/m/spl times/977 /spl mu/m with a core size of 500 /spl mu/m/spl times/500 /spl mu/m. The core power consumption is 83.56 mW. A Simulink-based orthogonal frequency demodulation multiplexing platform is utilized to compare the conventional fixed-point FFT and proposed IntFFT from the viewpoint of system-level behavior in items of signal-to-quantization-noise ratio (SQNR) and bit error rate (BER). The quantization loss analysis of these two types of FFT is also derived and compared. Based on the simulation results, the proposed lossless IntFFT architecture can achieve comparative SQNR and BER performance with reduced memory usage.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号