1.

Optimized Design of a Fast Floating-Point Adder (total citations: 3; self-citations: 0; citations by others: 3)

王颖林正浩《电子工程师》2004,30(11):24-26

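Floating-point numbers in an arithmetic unit offer high representational precision and a wide dynamic range, and floating-point arithmetic has become an indispensable part of modern computing programs. Floating-point addition is the most frequently used floating-point operation, so the performance of the floating-point adder affects the floating-point capability of the whole CPU. Starting from an analysis of the basic algorithm for floating-point addition and subtraction, the paper introduces a new algorithm, the three-data-path floating-point addition algorithm, concentrates on the design of the integer adder and the shifter, and optimizes the design of a 32-bit floating-point adder.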
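The basic addition algorithm that this abstract starts from (align the exponents, add or subtract the significands, renormalize) can be sketched as below. This is only an illustration of the textbook single-path flow, not the paper's three-data-path algorithm. The function names are hypothetical, and the sketch assumes normal, finite, nonzero single-precision inputs, truncates instead of rounding, and ignores exponent overflow and underflow.

```python
import struct

def f32_bits(x: float) -> int:
    """Bit pattern of x as an IEEE 754 single-precision number."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(b: int) -> float:
    """Single-precision value for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def unpack(b: int):
    """Split into sign, biased exponent, and 24-bit significand.
    Assumption: the input is a normal number, so the hidden 1 is restored."""
    return b >> 31, (b >> 23) & 0xFF, (b & 0x7FFFFF) | 0x800000

def fp32_add(a: float, b: float) -> float:
    sa, ea, ma = unpack(f32_bits(a))
    sb, eb, mb = unpack(f32_bits(b))
    # Step 1: make operand a the one with the larger magnitude.
    if (ea, ma) < (eb, mb):
        sa, ea, ma, sb, eb, mb = sb, eb, mb, sa, ea, ma
    # Step 2: align the smaller significand (bits shifted out are dropped).
    mb >>= ea - eb
    # Step 3: add for equal signs, subtract otherwise (integer adder).
    m = ma + mb if sa == sb else ma - mb
    if m == 0:
        return 0.0
    e = ea
    # Step 4: normalize back to a 24-bit significand with the top bit set.
    if m >> 24:            # carry out of the adder
        m >>= 1
        e += 1
    while not (m >> 23):   # cancellation left leading zeros
        m <<= 1
        e -= 1
    return bits_f32((sa << 31) | (e << 23) | (m & 0x7FFFFF))
```

For inputs whose sum is exactly representable, such as `fp32_add(1.5, 2.25)`, the result matches ordinary addition; a hardware design would add guard bits and a rounding stage after step 4.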

2.

VHDL Implementation of a Floating-Point FFT

程俊《现代电子技术》2005,28(21):58-59,62

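With the development of integrated-circuit technology, electronic design automation has gradually become an important design method and is now widely used in digital circuits, digital signal processing systems, and many other fields. The paper presents a floating-point FFT designed in VHDL. The design uses the radix-2 algorithm and a single-precision 32-bit binary floating-point format, and the main controller is modeled as a state machine. The whole design was developed with Xilinx's ISE 5.3 software and follows a structured design approach. The complete design passed simulation and verification in Modelsim, and code coverage of the more than twenty modules reached 100%. The results show that the FFT processor implemented in VHDL can quickly perform the fast Fourier transform on floating-point data, and the code coverage indicates that the system was tested fairly thoroughly. The system can be extended to 16-point and 32-point floating-point FFTs.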
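For readers unfamiliar with the radix-2 algorithm named in this abstract, a minimal software sketch follows. It is a plain recursive decimation-in-time FFT in Python, not the paper's VHDL design; the function name `fft_radix2` is hypothetical, and Python complex numbers stand in for the 32-bit floating-point datapath.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT.
    Assumption: len(x) is a power of two (1, 2, 4, 8, ...)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])   # FFT of even-indexed samples
    odd = fft_radix2(x[1::2])    # FFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: one twiddle-factor multiply, one add, one subtract.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Each loop iteration is one butterfly; a hardware implementation would typically schedule these butterflies stage by stage under a controller rather than recursing.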

3.

Product Roundup: Microcontrollers/Microprocessors

《今日电子》2011,(1):62

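High-performance, low-power 32-bit floating-point digital signal processors: the high-performance SHARC 2148x and low-power SHARC 2147x processor families integrate up to 5 Mb of memory, provide single-chip floating-point signal-processing precision for a variety of applications, and bring high-end system features to portable devices. SHARC 2148x processors offer 33% higher performance (400 MHz) than other 32-bit floating-point DSP products, and the SHARC 2147x series…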

4.

Design of the LSRISC 32-Bit Floating-Point Array Multiplier (total citations: 5; self-citations: 2; citations by others: 3)

许琪沈绪榜钱刚李莉赵宁《微电子学与计算机》2001,18(4):19-24

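The paper describes the design of the 32-bit floating-point multiplier in LSRISC. It can perform multiplication of 32-bit fixed-point integers and ordinal numbers (序数 in the source) as well as multiplication of single-extended-precision floating-point data as specified by IEEE 754.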

5.

SHARC 2148x/47x: 32-Bit Floating-Point DSPs

《世界电子元器件》2010,(9):26-26

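ADI has introduced the SHARC 2148x and SHARC 2147x series of 32-bit floating-point digital signal processors (DSPs). With up to 5 Mb of integrated memory, the high-performance SHARC 2148x and low-power 2147x processors improve single-chip floating-point signal-processing precision for a variety of applications and bring high-end system features to portable devices.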

6.

Design Study of a 32-Bit Floating-Point Embedded MCU (total citations: 1; self-citations: 2; citations by others: 1)

唐明哲邵志标赵宁许琪《微电子学与计算机》2004,21(7):30-33

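This paper presents the design and implementation of a 32-bit floating-point embedded MCU based on a RISC architecture. The MCU contains 128 kbit of SRAM and uses a Harvard architecture, a four-stage instruction pipeline, a 32-bit instruction word, and a 43-bit internal data word. Multiple fast registers inside the MCU, together with hard-wired logic in place of microprogrammed control, speed up the microprocessor and improve instruction execution efficiency. The design also avoids data hazards by writing registers synchronously and reading them asynchronously.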

7.

Implementation of a Floating-Point Multiplier in an FPGA (total citations: 2; self-citations: 0; citations by others: 2)

金美华宋万杰吴顺君《火控雷达技术》2008,37(1):104-107

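The paper designs a multiplier structure suited to FPGA implementation. It uses a custom 26-bit floating-point data format, an improved radix-4 Booth encoding, and a Wallace tree built from CSAs and 4-2 compressors, and it applies a fast rounding method based on prediction and selection to mantissa rounding, which optimizes the multiplier's performance. Finally, simulation results in the FPGA are given to verify the correctness of the design. Compared with results for the 32-bit floating-point data format, the design both uses fewer internal FPGA resources and runs faster.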
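As background for the Booth encoding mentioned in this abstract, the sketch below shows standard radix-4 Booth recoding, which halves the number of partial products by turning the multiplier into digits from {-2, -1, 0, 1, 2}. It is the textbook scheme, not the paper's improved variant; the function names and the 8-bit default width are assumptions made for illustration.

```python
def booth_radix4_digits(y: int, bits: int = 8):
    """Recode a `bits`-bit two's-complement multiplier into radix-4 Booth
    digits, least significant digit first. `bits` must be even."""
    if bits % 2:
        raise ValueError("bits must be even")
    y &= (1 << bits) - 1          # two's-complement bit pattern of y
    digits = []
    prev = 0                      # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # Each overlapping triplet (y[i+1], y[i], y[i-1]) gives one digit.
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits

def booth_multiply(x: int, y: int, bits: int = 8) -> int:
    """Sum the Booth partial products d_i * x * 4**i.
    Assumption: y fits in `bits`-bit two's complement."""
    return sum(d * x * 4 ** i
               for i, d in enumerate(booth_radix4_digits(y, bits)))
```

In hardware, the four partial products of an 8-bit multiplier would then be reduced by a carry-save structure such as the Wallace tree the abstract describes, instead of Python's `sum`.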

8.

Design and Implementation of the Floating-Point Pipeline of the 32-Bit RISC Microprocessor “龙腾R2” (Longteng R2)

李大鹏张盛兵罗旻《微电子学与计算机》2006,23(1):188-191

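The paper describes the architecture and design of the floating-point processing unit of the 32-bit RISC microprocessor “龙腾R2” (Longteng R2), focusing on a high-performance floating-point pipeline with out-of-order execution and out-of-order completion. To achieve precise interrupt response in the pipeline, the paper adopts a floating-point exception prediction method based on operand exponents and operation type, and the pipeline's issue policy is decided by the prediction result. Synthesis results for a 0.18 μm standard-cell library show that, compared with floating-point units using in-order control or the Tomasulo algorithm, the floating-point pipeline implemented with this method achieves the desired results at a small cost in hardware area and meets the functional and timing requirements.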

9.

A Deeply Pipelined Floating-Point Adder

邵杰伍万棱余汉城《电子器件》2007,30(3):911-914

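With the development of digital signal processing technology, FPGAs are increasingly used to implement high-performance scientific computing on high-speed hardware. This paper raises the throughput per unit time of a floating-point adder by increasing its number of pipeline stages, and explores the feasibility of fully exploiting the abundant flip-flops inside an FPGA to raise the system clock frequency. It proposes a multi-path floating-point adder structure in which exponent and mantissa operations, as well as addition and subtraction operations, are separated. For single-precision (32-bit) operands on an Altera Stratix II device, an 8-stage pipeline can run at more than 356 MHz.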

10.

SHARC DSP Portfolio Makes a Leap in High-Performance, Low-Power Floating-Point Processing Precision

《国外电子元器件》2010,(5):148-148

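Analog Devices, Inc. has introduced the newest members of its SHARC portfolio of 32-bit floating-point digital signal processors: the SHARC 2148x and SHARC 2147x series. With up to 5 Mb of integrated memory, the high-performance SHARC 2148x and low-power SHARC 2147x processors improve single-chip floating-point signal-processing precision for a variety of applications and bring high-end system features to portable devices. With the SHARC 2148x and SHARC 2147x processors, …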

11.

Implementation of the FFT Algorithm on the TMS320VC5416

贾玮杨录张艳花《山西电子技术》2009,(2):11-13

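The fast Fourier transform (FFT) is a transform that takes a signal from the time domain to the frequency domain, and it is an important analysis tool in acoustics, imaging, telecommunications, signal processing, and other fields. In recent years, dedicated digital signal processors, with their optimized hardware structure and good price/performance ratio, have provided an effective way to implement the FFT. The paper details the design of a hardware platform for implementing the FFT algorithm built around the floating-point DSP 5416.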

12.

A reconfigurable 4-GS/s power-efficient floating-point FFT processor design and implementation based on single-sided binary-tree decomposition

《Integration, the VLSI Journal》2019

This paper presents a high-throughput, size-configurable floating-point (FP) fast Fourier transform (FFT) processor that implements 8-parallel multi-path delay feedback (MDF) and is suited to real-time radar imaging systems. In floating-point FFT design, achieving high throughput within a restricted area and power budget is a greater challenge than in fixed-point design, because FP operations are more complex to realize than their fixed-point counterparts. To address this, a novel mixed-radix FFT algorithm featuring a single-sided binary-tree decomposition strategy is proposed to contain the multiplication complexity of any 2^k-point FFT. To this end, the parallel-processing twiddle factor generator and the dual addition-and-rounding fused FP arithmetic units are optimized to meet the high accuracy demand in computation and the low power budget in implementation. The proposed FP FFT processor has been designed in silicon in SMIC's 28 nm CMOS technology with an active area of 1.39 mm². The prototype design delivers a throughput of 4 GSample/s at 500 MHz, at a peak power consumption of 84.2 mW. The proposed approach thus improves power efficiency by about 14 times on average over previously reported FP FFT processors.

13.

Design and Implementation of a Mixed-Radix Reconfigurable FFT Processor

宋宇鲲曲双双徐礼晗张多利《微电子学与计算机》2020,(1):87-92,98

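This paper proposes a new mixed-radix reconfigurable FFT processor, composed of a new reconfigurable butterfly unit that supports radix-2/3 FFTs and a multi-path parallel conflict-free memory, which together achieve multi-path data parallelism and continuous operation during the FFT. The design reaches a maximum frequency of 1.06 GHz in a TSMC 28 nm process, and a hardware test system for the mixed-radix FFT processor was built on a Xilinx XC7V2000T FPGA. FPGA hardware tests show that the design supports radix-2, radix-3, and mixed radix-2/3 FFT modes, that its execution speed reaches the theoretical cycle count for the given number of butterfly multipliers, and that for single-precision floating-point numbers the processor provides a result accuracy of 10^-5.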

14.

Design of a coarse-grained reconfigurable architecture with floating-point support and comparative study

Manhwee Jo Dongwook Lee Kyuseung Han Kiyoung Choi 《Integration, the VLSI Journal》2014

With a huge increase in demand for various kinds of compute-intensive applications in electronic systems, researchers have focused on coarse-grained reconfigurable architectures because of their advantages: high performance and flexibility. This paper presents FloRA, a coarse-grained reconfigurable architecture with floating-point support. A two-dimensional array of integer processing elements in FloRA is configured at run-time to perform floating-point operations as well as integer operations. For a chip fabricated in a 130 nm process, the total area overhead due to additional hardware for floating-point operations is about 7.4% compared to the previous architecture, which does not support floating-point operations. The fabricated chip runs at a 125 MHz clock frequency with a 1.2 V power supply. Experiments show an 11.6× speedup on average compared to ARM9 with a vector-floating-point unit for integer-only benchmark programs as well as programs containing floating-point operations. Compared with other similar approaches including XPP and Butter, the proposed architecture shows much higher performance for integer applications, while maintaining about half the performance of Butter for floating-point applications.

15.

Improving Floating-Point Performance in Less Area: Fractured Floating Point Units (FFPUs)

Neil Hockert Katherine Compton 《Journal of Signal Processing Systems》2012,67(1):31-46

Embedded systems designers often use fixed-point instead of floating-point due to the performance and area overhead of floating-point units. If the range of floating-point representation is required, the system may use a software-based floating-point library on an integer-only processor to save area, at the cost of much lower performance. Instead, we propose a Fractured Floating Point Unit (FFPU), a hybrid solution that uses a set of custom hardware instructions to accelerate software-based floating-point emulation. An FFPU is intended as a compromise between software libraries and full FPUs in terms of both area and performance. We present four potential 32-bit FFPU designs for a Nios II soft processor. We compare their performance and area to the baseline Nios II, as well as a Nios II with a complete FPU. We show that an FFPU can improve various floating-point operations, including improving addition and subtraction performance by 24 to 52 percent over the baseline. This performance comes at a resource cost of only an 11 to 29 percent ALM increase, and no increase in DSP blocks.

16.

Systolic FFT Processors: A Personal Perspective

Earl E. Swartzlander Jr. 《Journal of Signal Processing Systems》2008,53(1-2):3-14

This paper provides a personal perspective on developments in the implementation of two systolic fast Fourier transform processors over the last 25 years and identifies some of the lessons learned. This has been a period of tremendous advances in integrated circuit technology, as the resulting processors demonstrate. The first processor is the Modular Transform Processor that was developed at TRW in the 1982–1984 time frame using VLSI technology. It is a set of six large circuit boards that computes 4,096-point fast Fourier transforms using 22-bit floating-point arithmetic at sustained data rates of 40 MSPS. The second processor is a single ASIC chip systolic FFT processor developed by the Mayo Foundation in the 2001–2002 time frame that computes 4,096-point FFTs using 16-bit fixed-point arithmetic at sustained data rates of 200 MSPS. Some thoughts on the future directions of systolic FFT processor development are offered. Future systems will compute large transforms (e.g., 16 K-point to 1 M-point) at high data rates (e.g., 500 MSPS to 1 GSPS), will employ more precise arithmetic (e.g., 32-bit single precision IEEE Standard floating-point arithmetic), will consume very low power (e.g., on the order of one watt) and will be realized on a single chip.

17.

Reduced-complexity circuit for neural networks

Watkins S.S. Chau P.M. 《Electronics letters》1995,31(19):1644-1646

The Letter demonstrates that a 10 bit reduced-complexity VLSI circuit can be used in place of a 32 bit floating-point processor to speed up some neural network applications, reducing circuit area and power consumption by 88% with a negligible increase in RMS error. Applications were executed on a radial basis function neurocomputer using the reduced-complexity circuit implemented with FPGA technology. One application produced better results than had been previously obtained for a NASA data set using either neural network or non-neural network approaches.

18.

Design and Implementation of a High-Throughput Dual-Mode Floating-Point Reconfigurable FFT Processor

魏星黄志洪杨海钢《电子与信息学报》2018,40(12):3042-3050

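A high-throughput, flexibly reconfigurable floating-point fast Fourier transform (FFT) processor can meet the needs of applications such as real-time imaging in advanced radar and high-precision scientific computing. Compared with a fixed-point FFT, floating-point arithmetic is more complex, which makes the tension between the throughput of a floating-point FFT and its area and power consumption especially pronounced. To reduce computational complexity, a large-point FFT is first decomposed into several cascaded small-point radix-2^k sub-stages, and optimized mixed-radix algorithms are proposed for 128/256/512/1024/2048-point FFTs. Combined with a proposed new fused add-subtract and dot-product arithmetic unit that supports both a single-channel single-precision mode and a dual-channel half-precision mode, a high-throughput dual-mode variable-size floating-point FFT processor architecture is proposed for the first time and is designed and implemented in a 28 nm standard CMOS process. Experimental results show that the throughput and the average output signal-to-quantization-noise ratio are 3.478 GSample/s and 135 dB in single-channel single-precision mode, and 6.957 GSample/s and 60 dB in dual-channel half-precision mode. The normalized throughput-to-area ratio is about 12 times higher than that of other existing floating-point FFT implementations.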

19.

Configurable Floating-Point FFT Accelerator on FPGA Based Multiple-Rotation CORDIC

《电子学报:英文版》2016,(6):1063-1070

Fast Fourier transform (FFT) accelerators and the Coordinate rotation digital computer (CORDIC) algorithm play important roles in signal processing. We propose a configurable floating-point FFT accelerator based on CORDIC rotation, in which twiddle direction prediction is presented to reduce hardware cost and twiddle angles are generated in real time to save memory. To finish CORDIC rotation efficiently, a novel approach is presented that uses segmented-parallel iteration and compress iteration based on CSA, and redundant CORDIC is used to reduce the latency of each iteration. To prove the efficiency of our FFT accelerator, four FFT accelerators are prototyped on an FPGA chip to perform a batch FFT. Experimental results show that our structure, which is composed of four butterfly units and performs FFTs with sizes ranging from 64 to 8192 points, occupies 33230 (3%) REGs and 143006 (30%) LUTs. The clock frequency can reach 122 MHz. The double-precision FFT uses only about 2.5 times the resources of the single-precision FFT, while the theoretical value is 4. Moreover, only 13331 cycles are required to perform an 8192-point double-precision FFT with four butterfly units in parallel.
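The CORDIC rotation that this abstract builds on can be illustrated with the conventional iterative form below, which applies a twiddle rotation using only shift-like scalings and additions. This is a minimal sketch of classic rotation-mode CORDIC, not the paper's multiple-rotation or redundant variant; the function name and the iteration count are assumptions, and Python floats stand in for a hardware datapath.

```python
import math

def cordic_rotate(x: float, y: float, angle: float, iterations: int = 32):
    """Rotate the vector (x, y) by `angle` radians with rotation-mode CORDIC.
    Assumption: |angle| stays within the convergence range of about 1.74 rad."""
    # Elementary angles atan(2^-i); a hardware design would store these in a ROM.
    atans = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; the total gain is removed at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotation direction for this step
        # In hardware the factor 2^-i is a right shift, so no multiplier is needed.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * atans[i]                    # residual angle still to rotate
    return x / gain, y / gain
```

Calling `cordic_rotate(1.0, 0.0, theta)` approximates `(cos(theta), sin(theta))`, which is the twiddle factor an FFT butterfly needs for angle `theta`.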

20.

Design of the LOD Circuit for the Floating-Point Adder in a DSP Chip

车德亮黄士坦刘军华唐威段来仓《微电子学与计算机》2003,20(4):60-62,65

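The speed of the floating-point adder in a DSP chip limits the operating speed of the whole chip, and the speed of the LOD circuit is in turn the bottleneck of the floating-point adder. The performance of the whole DSP chip can therefore be improved by improving the LOD circuit. The LOD is designed from two aspects, its structure and its logic, yielding a fast and efficient LOD circuit. It targets the TMS320C3X extended-precision floating-point data format.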
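To show what such a circuit computes, the sketch below models a hierarchical detector in Python. It assumes that LOD stands for leading-one detector, which the abstract does not spell out, and it mirrors a generic divide-in-halves tree structure, not the circuit designed in the paper. The function name `lod` is hypothetical.

```python
def lod(value: int, width: int):
    """Return (valid, position) for the most significant 1 bit of the low
    `width` bits of `value`; position is counted from the LSB (bit 0).
    valid is False when all `width` bits are zero."""
    if width < 1:
        raise ValueError("width must be at least 1")
    if width == 1:
        return bool(value & 1), 0
    half = width // 2
    # Upper part has priority: if it contains a 1, the lower part is ignored.
    hi_valid, hi_pos = lod(value >> half, width - half)
    if hi_valid:
        return True, hi_pos + half
    # Otherwise the answer comes from the lower part.
    return lod(value & ((1 << half) - 1), half)
```

After a significand subtraction, a normalizer would shift left by `width - 1 - position` and decrease the exponent by the same amount, which is why this detector sits on the adder's critical path.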