
1.

Radix-2 FFT butterfly processor using distributed arithmetic

MacTaggart I.R., Jack M.A. 《Electronics letters》1983,19(2):43-44

A parallel-data VLSI architecture for computation of the fast Fourier transform (FFT) is described. The processor is based on a computationally efficient vector rotate algorithm. Use of a 2-dimensional pipeline configuration allows a radix-2 butterfly operation to be performed once every system clock cycle (250 ns) to generate real or imaginary transform components. The architecture is considered to be a computationally efficient VLSI approach for high-bandwidth computation of the FFT. The design and performance of an 8-bit FFT butterfly processor are described.
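
The radix-2 butterfly named in this abstract can be sketched in software as follows. This is only a floating-point illustration of the textbook decimation-in-time butterfly, not the distributed-arithmetic hardware of the paper; the function names `radix2_butterfly` and `fft_radix2` are hypothetical.

```python
import cmath


def radix2_butterfly(a: complex, b: complex, w: complex) -> tuple[complex, complex]:
    """Textbook decimation-in-time radix-2 butterfly: (a + w*b, a - w*b)."""
    t = w * b
    return a + t, a - t


def fft_radix2(x: list[complex]) -> list[complex]:
    """Recursive radix-2 FFT built from the butterfly above (length must be a power of two)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        out[k], out[k + n // 2] = radix2_butterfly(even[k], odd[k], w)
    return out
```

A hardware processor of this kind would evaluate one such butterfly per clock cycle; the recursion here only shows where the butterflies sit in the transform.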

2.

A new architecture for a 1024-point radix-2 FFT twiddle-factor generation circuit

Li Jincheng (李金城), Yang Huazhong (杨华中) 《半导体学报》 (Chinese Journal of Semiconductors) 2004,25(4)

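A new memoryless twiddle-factor generation circuit for a 1024-point radix-2 FFT is designed. The circuit uses several logic modules to generate data and then synthesizes the required twiddle factors from these data. Power analysis with Synopsys Power Compiler shows that the circuit, synthesized in a TSMC 0.25 μm CMOS process, consumes 2 mW at 50 MHz. This twiddle-factor generation circuit is well suited to low-power designs, especially mobile communication and other handheld devices.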

3.

Design of power efficient butterflies from Radix-2 DIT FFT using adder compressors with a new XOR gate topology

Mateus Beck Fonseca, Eduardo A. César da Costa, João B. S. Martins 《Analog Integrated Circuits and Signal Processing》2012,73(3):945-954

This paper addresses the design of power efficient dedicated structures of Radix-2 Decimation in Time (DIT) pipelined butterflies, aiming at the implementation of low power Fast Fourier Transform (FFT), using adder compressors with a new XOR gate topology. In the FFT computation, the butterflies play a central role, since they allow calculation of complex terms. Because this calculation involves multiplications of input data with appropriate coefficients, optimizing the butterfly can contribute to reducing the power consumption of FFT architectures. In this paper, different dedicated structures for the 16 bit-width pipelined Radix-2 DIT butterfly, running at 100 MHz, are implemented, where the main goal is to minimize both the number of real multipliers and the critical path of the structures. This is done by changing the structure of the complex multipliers and applying them in the butterflies. For logic synthesis of the implemented butterflies, the Cadence Encounter RTL Compiler tool was used with the XFAB MOSLP 0.18 μm library. Area and power consumption results are presented for the synthesized butterflies. Regarding power consumption, switching activity analysis is performed using 10,000 input vectors at the inputs of the butterflies. The main results show that combining the pipeline approach with efficient adder compressors that use a new XOR gate topology significantly reduces the power consumption of the butterflies.
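
One widely known way to reduce the number of real multipliers in a complex multiplication is the three-multiplier rearrangement sketched below. The abstract does not say which structures the authors used, so this is an assumed illustration of the general idea, and `complex_mul_3` is a hypothetical name.

```python
def complex_mul_3(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """Compute (a + jb) * (c + jd) with three real multiplications instead of four.

    Costs five additions/subtractions; this trades one multiplier for extra adders.
    """
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    real = k1 - k3  # = a*c - b*d
    imag = k1 + k2  # = a*d + b*c
    return real, imag
```

In a butterfly, `(c, d)` would be the twiddle factor, so `d - c` and `c + d` can be precomputed, leaving three multiplications and three additions per product.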

4.

Design and implementation of a 24-bit parallel prefix adder based on the Sklansky structure

《现代电子技术》 (Modern Electronics Technique) 2015,(21):145-148

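To address the delay of the ripple-carry adder, a parallel prefix adder based on the Sklansky structure is adopted. By optimizing each module of the parallel prefix adder, a 24-bit parallel prefix adder is designed and implemented. A delay comparison with a 24-bit ripple-carry adder shows that the Sklansky parallel prefix adder effectively improves computation speed.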
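
The Sklansky prefix structure can be modelled bit by bit as in the sketch below. It is a behavioural model of the generic Sklansky (divide-and-conquer) prefix tree under the usual generate/propagate definitions, not the optimized modules of the paper; `sklansky_add` is a hypothetical name.

```python
def sklansky_add(a: int, b: int, width: int = 24) -> tuple[int, int]:
    """Add two `width`-bit numbers with a Sklansky parallel prefix carry tree.

    Returns (sum modulo 2**width, carry_out).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    p = [((a ^ b) >> i) & 1 for i in range(width)]  # bit propagate
    g = [((a & b) >> i) & 1 for i in range(width)]  # bit generate
    G, P = g[:], p[:]
    level = 0
    while (1 << level) < width:
        new_g, new_p = G[:], P[:]
        for i in range(width):
            if (i >> level) & 1:
                # Top bit of the lower half-block at this level.
                j = ((i >> level) << level) - 1
                new_g[i] = G[i] | (P[i] & G[j])
                new_p[i] = P[i] & P[j]
        G, P = new_g, new_p
        level += 1
    total = 0
    for i in range(width):
        carry_in = G[i - 1] if i else 0  # group generate of bits 0..i-1
        total |= (p[i] ^ carry_in) << i
    return total, G[width - 1]
```

For 24 bits the tree needs five prefix levels, compared with a carry chain through all 24 positions in a ripple-carry adder, which is the source of the delay advantage the abstract reports.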

5.

A new architecture for a 1024-point radix-2 FFT twiddle-factor generation circuit

Li Jincheng (李金城), Yang Huazhong (杨华中) 《半导体学报》 (Chinese Journal of Semiconductors) 2004,25(4):377-382

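A new memoryless twiddle-factor generation circuit for a 1024-point radix-2 FFT is designed. The circuit uses several logic modules to generate data and then synthesizes the required twiddle factors from these data. Power analysis with Synopsys Power Compiler shows that the circuit, synthesized in a TSMC 0.25 μm CMOS process, consumes 2 mW at 50 MHz. This twiddle-factor generation circuit is well suited to low-power designs, especially mobile communication and other handheld devices.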

6.

An efficient SIMD parallel memory structure for the radix-2 FFT algorithm


Chen Haiyan (陈海燕), Yang Chao (杨超), Liu Sheng (刘胜), Liu Zhong (刘仲) 《电子学报》 (Acta Electronica Sinica) 2016,44(2):241-246

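As DSPs (digital signal processors) with SIMD (single instruction, multiple data stream) architectures integrate more and more processing units on chip, the flexibility and bandwidth efficiency of parallel memory access increasingly affect actual computational performance. This paper analyzes in detail the memory access problems faced by parallel radix-2 FFT (fast Fourier transform) algorithms on general SIMD DSPs. It uses simple XOR logic on partial address bits to perform SIMD parallel memory address translation, achieving conflict-free SIMD parallel memory access for FFT computation, and proposes several vector memory access instructions with special shuffle modes that completely eliminate the extra shuffle instructions otherwise required by radix-2 FFT on SIMD architectures. Finally, the scheme is applied to the optimized design of the vector memory (VM) in YHFT-Matrix2, a 16-lane SIMD digital signal processor. Test results show that the VM optimized with this SIMD parallel memory structure achieves fully pipelined, conflict-free parallel memory access and 100% parallel memory bandwidth utilization for FFT computation at an 18% increase in hardware cost; compared with the design before optimization, FFTs of different sizes obtain speedups of 1.32 to 2.66.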
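
The idea of XOR-based conflict-free addressing can be illustrated with the classic two-bank parity mapping below. This is a deliberately simplified stand-in, assumed for illustration only: the paper's scheme targets 16 SIMD lanes and XORs only part of the address, and its details are not given in the abstract. All function names are hypothetical.

```python
def bank_of(addr: int) -> int:
    """Bank index = XOR of all address bits (parity), giving two banks."""
    return bin(addr).count("1") & 1


def butterfly_pairs(n: int, stage: int) -> list[tuple[int, int]]:
    """Operand index pairs of one radix-2 stage: indices differing only in bit `stage`."""
    half = 1 << stage
    return [(i, i | half) for i in range(n) if not i & half]


def is_conflict_free(n: int) -> bool:
    """True if, in every stage, both operands of each butterfly lie in different banks."""
    stages = n.bit_length() - 1
    return all(
        bank_of(a) != bank_of(b)
        for s in range(stages)
        for a, b in butterfly_pairs(n, s)
    )
```

Because the two operands of a radix-2 butterfly differ in exactly one address bit, their parities always differ, so both can be fetched in the same cycle from separate banks at every stage.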

7.

FPGA implementation of a radix-2 FFT processor

Zhang Hui (张辉), Zhang Jilong (张记龙) 《中国集成电路》 (China Integrated Circuit) 2009,18(6):26-30

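To meet the wide demand for the fast Fourier transform in digital signal processing, and based on an analysis of the algorithm's principles, an implementation scheme for an 8-point radix-2 decimation-in-time FFT processor is given. The design was synthesized on a Xilinx xc3s 1500 series chip and simulated with Modelsim SE6.0. Experimental results show that the processor functions correctly and achieves high computation speed and precision.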

8.

Radioactive mineral identification based on FFT Radix-2 algorithm

Sharma A. 《Electronics letters》2004,40(9):536-537

A fast Fourier transform (FFT) Radix-2 based approach using a gamma-ray spectrometer for the identification of radioactive mineral is presented. The proposed method reduces the complexity of diagnosis associated with the spectrometer by transferring time-domain analysis to frequency-domain analysis. The operation of PDA/DSP is discussed pragmatically.

9.

A dynamic scaling FFT processor for DVB-T applications (total citations: 1; self-citations: 0; citations by others: 1)

Yu-Wei Lin, Hsuan-Yu Liu, Chen-Yi Lee 《Solid-State Circuits, IEEE Journal of》2004,39(11):2005-2013

This paper presents an 8192-point FFT processor for DVB-T systems, in which a three-step radix-8 FFT algorithm, a new dynamic scaling approach, and a novel matrix prefetch buffer are exploited. About 64 K bit memory space can be saved in the 8 K point FFT by the proposed dynamic scaling approach. Moreover, with data scheduling and pre-fetched buffering, single-port memory can be adopted without degrading throughput rate. A test chip for 8 K mode DVB-T system has been designed and fabricated using 0.18-μm single-poly six-metal CMOS process with core area of 4.84 mm². Power dissipation is about 25.2 mW at 20 MHz.

10.

A VLSI array processor for 16-point FFT (total citations: 1; self-citations: 0; citations by others: 1)

Lee Moon-Key, Shin Kyung-Wook, Lee Jang-Kyu 《Solid-State Circuits, IEEE Journal of》1991,26(9):1286-1292

An implementation of a two-dimensional array processor for fast Fourier transform (FFT) using a 2-μm CMOS technology is presented. The array processor, which is dedicated to 16-point FFT, implements a 4×4 mesh array of 16 processing elements (PEs) working in parallel. Design considerations in both the chip level and the PE level are examined. A layout design methodology based on bit-slice units (BSUs) results in a very simple design, easy debugging, and a regular interconnection scheme through abutment. It contains about 48,000 transistors on an area of 53.52 mm², excluding the 83-pad area, and operation is on a 15-MHz clock. The array processor performs 24.6 million complex multiplications per second, and computes a 16-point FFT in 3 μs.

11.

Designing novel reversible BCD adder and parallel adder/subtraction using new reversible logic gates

Rigui Zhou, Qian Wu, Yang Shi 《International Journal of Electronics》2013,100(10):1395-1414

Reversible logic has received much attention in recent years when calculation with minimum energy consumption is considered. Especially, interest is sparked in reversible logic by its applications in some technologies, such as quantum computing, low-power CMOS design, optical information processing and nanotechnology. This article proposes two new reversible logic gates, ZRQ and NC. The first gate ZRQ not only implements all Boolean functions but also can be used to design optimised adder/subtraction architectures. One of the prominent functionalities of the proposed ZRQ gate is that it can work by itself as a reversible full adder/subtraction unit. The second gate NC can complete overflow detection logic of Binary Coded Decimal (BCD) adder. This article proposes two approaches to design novel reversible BCD adder using new reversible gates. A comparative result which is presented shows that the proposed designs are more optimised in terms of number of gates, garbage outputs, quantum costs and unit delays than the existing designs.

12.

A radix-8 wafer scale FFT processor (total citations: 2; self-citations: 0; citations by others: 2)

Earl E. Swartzlander Jr., Vijay K. Jain, Hiroomi Hikawa 《The Journal of VLSI Signal Processing》1992,4(2-3):165-176

Wafer Scale Integration promises radical improvements in the performance of digital signal processing systems. This paper describes the design of a radix-8 systolic (pipeline) fast Fourier transform processor for implementation with wafer scale integration. By the use of the radix-8 FFT butterfly wafer that is currently under development, continuous data rates of 160 MSPS are anticipated for FFTs of up to 4096 points with 16-bit fixed point data.

13.

A VLSI processor for parallel contour tracing

Agi I., Hurst P.J., Jain A.K. 《Signal Processing, IEEE Transactions on》1992,40(2):429-438


14.

A high-speed 64-point radix-4 FFT processor based on MVR-CORDIC

Hou Weihua (侯卫华), Guo Hui (郭晖), Liu Mingfeng (刘明峰), Yu Zongguang (于宗光) 《电子与封装》 (Electronics & Packaging) 2008,8(5):22-25

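This paper designs a 64-point radix-4 FFT processor in which a modified CORDIC (MVR-CORDIC) processing unit replaces the complex multiplier of a conventional FFT processor. While preserving SQNR performance, the modified CORDIC unit completes one complex multiplication with only a very small number of shift-and-add operations, shortening the time for one basic butterfly operation and reducing area overhead. The FFT processor uses two independent RAMs and stores intermediate data in a "ping-pong" fashion to save data storage time, thereby increasing the speed of one FFT computation. The designed FFT processor was verified on an FPGA; results show that a 64-point FFT takes less than 1 μs on average.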
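
How a CORDIC unit can stand in for a complex multiplier is sketched below using the conventional rotation-mode CORDIC. The MVR-CORDIC variant of the paper reduces the iteration count further and is not reproduced here; the floating-point arithmetic, the default of 16 iterations, and the name `cordic_rotate` are assumptions made for illustration.

```python
import math


def cordic_rotate(x: float, y: float, angle: float, iterations: int = 16) -> tuple[float, float]:
    """Rotate the vector (x, y) by `angle` radians using conventional CORDIC.

    Each iteration needs only shifts (scaling by 2**-i) and additions in hardware.
    Converges for |angle| up to about 1.74 rad.
    """
    z = angle
    gain = 1.0
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotation direction
        shift = 2.0 ** -i
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)            # remaining angle
        gain *= math.sqrt(1.0 + shift * shift)
    return x / gain, y / gain                # compensate the accumulated CORDIC gain
```

Multiplying a sample by a unit-magnitude twiddle factor is a pure rotation, so `cordic_rotate(re, im, -2 * math.pi * k / n)` plays the role of the complex multiplication by W_n^k within the convergence range noted above.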

15.

Fast parallel prefix logic circuits for n2n round-robin arbitration

H. Fatih Ugurdag, Onur Baskirt 《Microelectronics Journal》2012,43(8):573-581

An n2n round-robin arbiter (RRA) searches its n inputs for a 1, starting from the highest-priority input. It picks the first 1 and outputs its index in one-hot encoding. RRA aims to be fair to its inputs and maintains fairness by simply rotating the input priorities, i.e., the last arbitrated input becomes the lowest-priority input. Arbiters are used to multiplex the usage of shared resources among requestors as well as in dispatch logic where the purpose is load balancing among multiple resources. Today, arbiters have hundreds of ports and usually need to run at very high clock speeds. This article presents a new gate-level RRA circuit called Thermo Coded-Parallel Prefix Arbiter (TC-PPA) that scales to any number of requestors. It uses parallel prefix network topologies (borrowed from fast carry lookahead adders) to generate a thermometer-coded pointer, thus greatly reducing critical path. Code generators were written not only for TC-PPA but also for the 5 highly competitive circuits in the literature (9 including their variants), and a rich set of timing/area results were obtained using a standard-cell based logic synthesis flow with a novel iterative strategy based on binary search. Synthesis runs include results with wire-load and without. Results show that for 54 or more ports (except 256) TC-PPA offers the best timing (lowest latency) as well as competitive area. Contributions also include transaction-level simulations that show when pipelining is used to boost clock rate, latency and input FIFO sizes are adversely affected, and hence pipelining cannot be indiscriminately exploited to trim clock period.
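
The arbitration behaviour described here (pick the first request at or after the priority pointer, output it one-hot, then rotate priority) can be modelled as below. This is a behavioural sketch of the common mask-based formulation with a thermometer-coded pointer, assumed for illustration; it does not model the parallel prefix networks of TC-PPA, and `rr_arbiter` is a hypothetical name.

```python
def rr_arbiter(req: int, therm: int, n: int) -> tuple[int, int]:
    """One arbitration step of an n-input round-robin arbiter.

    req   : request bit vector (bit i set means input i requests).
    therm : thermometer-coded pointer; set bits mark inputs that currently
            have priority (all positions at or above the pointer).
    Returns (one-hot grant, next thermometer pointer).
    """
    full = (1 << n) - 1
    req &= full
    if req == 0:
        return 0, therm                      # nothing to grant; priority unchanged
    masked = req & therm                     # requests in the high-priority region
    pick = masked if masked else req         # wrap around if that region is empty
    grant = pick & -pick                     # lowest set bit, one-hot
    next_therm = full & ~((grant << 1) - 1)  # only inputs above the winner keep priority
    return grant, next_therm
```

With `n = 4` and a constant request vector `0b1011`, starting from `therm = 0b1111`, successive calls grant inputs 0, 1, 3 and then wrap back to 0, so the last granted input always becomes the lowest-priority one.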

16.

Static quantised radix-2 fast Fourier transform (FFT)/inverse FFT processor for constraints analysis

Rozita Teymourzadeh, Memtode Jim Abigo, Mok Vee Hoong 《International Journal of Electronics》2013,100(2):231-240

This research work focuses on the design of a high-resolution fast Fourier transform (FFT)/inverse FFT (IFFT) processor for constraints analysis purposes. Among the major setbacks associated with such high-resolution FFT processors are the high power consumption resulting from the structural complexity and the computational inefficiency of floating-point calculations. As such, a parallel pipelined architecture was proposed to statically scale the resolution of the processor to suit adequate trade-off constraints. Quantisation was applied to provide an approximation to address the finite word-length constraints of digital signal processing. An optimum operating mode was proposed, based on the signal-to-quantisation-noise ratio (SQNR) as well as the statistical theory of quantisation, to minimise the trade-off issues associated with selecting the most application-efficient floating-point processing capability in contrast to their resolution quality.

17.

An architecture for a VLSI FFT processor

Joseph Ja'Ja', Robert Michael Owens 《Integration, the VLSI Journal》1983,1(4):305-316

We propose a new VLSI architecture for an FFT processor. Our architecture uses few processing elements and can be laid out in a mesh-interconnected pattern. We show how to compute the discrete Fourier transform at n points with an optimal speed-up as long as the memory is large enough. The control is shown to be simple and easily implementable in VLSI.

18.

A fully parallel vector-quantization processor for real-time motion-picture compression

Nakada A., Shibata T., Konda M., Morimoto T., Ohmi T. 《Solid-State Circuits, IEEE Journal of》1999,34(6):822-830

A vector-quantization (VQ) processor system has been developed aiming at real-time compression of motion pictures using a 0.6-μm triple-metal CMOS technology. The chip employs a fully parallel single-instruction, multiple-data architecture having a two-stage pipeline. Each pipeline segment consists of 19 cycles, thus enabling the execution of a single VQ operation in only 19 clock cycles. As a result, it has become possible to encode a full-color picture of 640×480 pixels in less than 33 ms, i.e., the real-time compression of moving pictures has become available. The chip is scalable up to eight-chip master-slave configuration in conducting fully parallel search for 2-K template vectors. The chip operates at 17 MHz with a power dissipation of 0.29 W under a power-supply voltage of 3.3 V.

19.

A parallel full adder circuit using Josephson junctions

《Solid-State Circuits, IEEE Journal of》1981,16(1):43-48

Detailed investigations have been carried out on a Josephson parallel full adder as an example of a functional circuit using Josephson junctions. This circuit can be constructed with fewer devices as compared with conventional Josephson adders. Two-junction interferometers (d.c.-SQUIDs) are utilized as switching elements of the circuit. Only (2N+1) d.c.-SQUIDs are required for construction of an N bit circuit. Discussions focus on design theory of the full adder circuit. An experimental 4 bit circuit operation is also demonstrated.

20.

A fully parallel FFT address generation method for variable-length data

Liu Hongxia (刘红侠), Huang Jin (黄巾), Huang Shitan (黄士坦) 《信号处理》 (Signal Processing) 2009,25(2)

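Based on an analysis of the mixed-radix 4/2 FFT algorithm and on optimized methods for storing and reading sample data and twiddle factors, this paper proposes an algorithm that unifies address generation for N = 2^m points, for both odd and even m, in a single function, and designs simple circuits for generating the inserted value and quickly controlling the insertion position. As a result, with one counter and a single set of address generation hardware under simple switch-mode control, an address generation unit for FFTs of arbitrary length can be implemented; within one clock cycle it generates the addresses for reading the required twiddle factors and for accessing four operands in parallel. The FFT processor designed in this paper completes one radix-4 or two radix-2 butterfly operations per cycle, providing programmable transform length with high throughput and few resources, while avoiding repeated twiddle-factor reads and reducing power consumption.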