Similar literature
18 similar articles found
1.
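With the rapid development of the Internet of Things, smart terminal devices are tightly constrained in hardware resources and power supply, creating an urgent need for new low-power arithmetic units. To address the high power consumption of arithmetic units, a low-power approximate multiplier based on approximate compressors is proposed for error-tolerant applications such as image processing and deep learning. Experimental results show that, compared with existing approximate multipliers, the proposed multiplier reduces power consumption by 30.70% and delay by 26.50%, saves 30.23% of chip area, and improves both the power-delay product (PDP) and the energy-delay product (EDP) by more than 43%. It also offers some advantage in computational accuracy. Finally, the effectiveness of the approximate multiplier is verified in an image filtering application.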

2.
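With the rapid development of technologies such as cloud computing, the Internet of Things, and artificial intelligence, terminal devices face major challenges in hardware resources and energy consumption. To reduce the power consumption of arithmetic units, the paper proposes two low-power approximate multipliers based on a novel 4-1 compressor. By analyzing the error of the 4-1 compressor, an error compensation unit is designed and applied in the multipliers, reducing their accuracy loss. Simulation results show that, compared with an exact multiplier, the two proposed 8-bit unsigned approximate multipliers reduce delay by 5.67% and 18.23%, area by 6.54% and 20.36%, and power by 15.83% and 30.94%, respectively. Finally, the proposed designs perform well in an image sharpening experiment, verifying their effectiveness in error-tolerant applications.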

3.
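The floating-point multiplier is a key arithmetic unit in systems such as high dynamic range (HDR) image processing and wireless communication. It offers a wider dynamic range than a fixed-point multiplier, but at higher complexity. Approximate computing, an emerging paradigm, can substantially reduce hardware resources and power overhead within a bounded accuracy loss. This paper proposes a 16-bit half-precision approximate floating-point multiplier (App-Fp-Mul). For the mantissa multiplication module, and based on the probability that a 1 appears in its partial product array, it proposes an approximate 4-2 compressor that is insensitive to input order, together with an OR-gate compression method for the low-order bits, which effectively reduces the resources and power of the floating-point multiplier at a small accuracy loss. Compared with the exact design, the proposed approximate floating-point multiplier reduces area by 20% and power-delay product by 58% at a normalized mean error distance (NMED) of 0.0014. Compared with existing approximate designs, it achieves higher accuracy and a smaller power-delay product at the same approximation bit width. Finally, the proposed multiplier is applied to HDR image processing, where it reaches a peak signal-to-noise ratio of 83.16 dB and a structural similarity of 99.9989%, a significant improvement over existing mainstream schemes.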

4.
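Building on the Booth algorithm, and taking into account the pipeline structure of the MIPS 4KC microprocessor and the way its multiplier operates, an improved Booth multiplier design method is proposed and implemented with a full-custom approach. A multiplier unit implemented this way features small area, highly reusable cell circuits, low layout design effort, and low power consumption.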

5.
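Design of a 16×16-bit high-speed low-power pipelined multiplier (total citations: 1, self-citations: 0, citations by others: 1)
A design for a 16×16-bit high-speed, low-power pipelined multiplier is proposed. The multiplier uses Booth encoding and a Wallace tree, and its full-adder cell is a new quasi-domino logic whose performance is considerably better than that of an ordinary CMOS full adder. HSPICE simulation with a 0.5 μm CMOS process model shows that, at a frequency of 150 MHz and a supply voltage of 3.0 V, the average power consumption is 11.74 mW and the delay is 6.5 ns.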

6.
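徐锋, 邵丙铣. 《微电子学》, 2003, 33(1): 56-59
Based on a 0.6 μm twin-well CMOS process model, a high-speed, low-power 16×16-bit parallel multiplier is implemented. The circuit is designed with pass-transistor logic to achieve low power consumption. Using an improved low-power, fast Booth encoding circuit and a 4-2 compressor circuit, it achieves an operation time of 7.18 ns at a 2.5 V supply voltage, with an average power consumption of 9.45 mW at 100 MHz.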

7.
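Real-time signal processing systems require digital signal processors with higher speed and lower power consumption. The paper proposes a novel multiply-accumulator that can process 16-bit and 32-bit data in separate modes, or perform mixed 16-bit and 32-bit operations. The architecture reconstructs a 32-bit arithmetic unit from three 16-bit multipliers and can invoke one to three of the multiply-accumulate modules to process data of different precisions, meeting the design requirements of high speed and low power. In 32-bit mode, the data processing speed can reach the level of a 16-bit multiply-accumulator.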

8.
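A design scheme and a memory address generation method for a low-power configurable FFT processor are proposed, supporting 8-, 16-, 32-, 64-, 128-, and 256-point operations. Using a radix-2 algorithm and a memory-based sequential architecture, the wide memory is split into two narrower memories, and the number of real multipliers in the butterfly unit is reduced from 4 to 3, further reducing power consumption. In addition, a pipeline structure is used between memory reads/writes and butterfly computation to increase processing speed. The FFT processor was synthesized, placed, and routed with the SMIC 0.18 μm CMOS process library. The core area is 1.09 mm² and power consumption is only 0.69 mW/MHz, achieving the low-power goal.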
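The abstract does not say how the butterfly gets from four real multipliers to three. One standard way, shown here only as an assumed illustration and not as the paper's circuit, is to rewrite the complex product so that it shares one intermediate term. The function names `complex_mul_3` and `butterfly` are hypothetical.

```python
def complex_mul_3(a, b, c, d):
    """Return (re, im) of (a + jb) * (c + jd) using three real multiplications."""
    k1 = c * (a + b)      # multiplication 1
    k2 = a * (d - c)      # multiplication 2
    k3 = b * (c + d)      # multiplication 3
    # k1 - k3 = ac - bd (real part), k1 + k2 = ad + bc (imaginary part)
    return k1 - k3, k1 + k2


def butterfly(u, v, w):
    """Radix-2 butterfly: returns (u + w*v, u - w*v) with a 3-multiplier product."""
    re, im = complex_mul_3(v.real, v.imag, w.real, w.imag)
    t = complex(re, im)
    return u + t, u - t
```

The trade is one multiplier for three extra additions. Because the twiddle factor `w` is known in advance, `d - c` and `c + d` could be precomputed, which is one reason this form is attractive in hardware.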

9.
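《现代电子技术》, 2016(16): 155-158
Multipliers play an important role in digital signal processing systems but consume considerable power, so research on low-power multipliers is necessary. A design for an FIR filter based on a multiply-accumulate (MAC) unit is presented, in which the multiplier is a radix-4 Wallace tree multiplier and the adder is a carry-lookahead adder. After optimization and integration, a low-delay, low-power FIR filter is obtained. Experiments show that the FIR filter designed in the paper has very small delay and very low dynamic power consumption.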
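To make the role of the MAC unit concrete, here is a minimal behavioural sketch of a direct-form FIR filter in which every tap costs one multiply-accumulate. It models only the arithmetic, not the Wallace tree or carry-lookahead hardware, and the name `fir_mac` is hypothetical.

```python
def fir_mac(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * x[n - k]."""
    delay = [0] * len(coeffs)          # delay line, newest sample first
    out = []
    for x in samples:
        delay = [x] + delay[:-1]       # shift the new sample in
        acc = 0
        for h, d in zip(coeffs, delay):
            acc += h * d               # one multiply-accumulate per tap
        out.append(acc)
    return out
```

Feeding a unit impulse returns the coefficients themselves, which is a quick sanity check on tap order.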

10.
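A design method for a dedicated high-speed DSP multiplier is presented. The multiplier uses an optimized Booth encoding algorithm to reduce the number of partial products, and an optimized Wallace tree scheme together with a fast carry-lookahead adder to further increase operating speed. It completes 16-bit signed/unsigned binary multiplication and complex multiplication in one clock cycle, and reaches a maximum frequency above 220 MHz at the slow corner. The multiplier is the dedicated multiplication unit of a DSP core, and the overall design is simple and efficient.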

11.
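An approximate multiplier based on a static segmentation compensation method is proposed. The partial product array is generated by a Booth encoding method based on static segmentation, and the array then undergoes error compensation optimization and approximate compression to trade off hardware performance against accuracy. Simulation results show that, compared with a full-precision multiplier generated by a synthesis tool, the design reduces area and power by 36.96% and 35.95%, respectively, while maintaining a high level of accuracy. In an image edge detection application, the design achieves a peak signal-to-noise ratio of 26.10 and a structural similarity of 98%, showing that it reduces hardware resource consumption while delivering results close to those of a full-precision multiplier.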

12.
This paper proposes a dynamic error-compensated circuit for a fixed-width Booth multiplier based on the conditional probability of input series (CPIS), which enables high-speed operation and low circuit overhead. The dynamic compensated value is produced directly from the multiplier of input series simultaneously with the Booth encoder and therefore does not affect the critical path. The compensated formula is derived using a mathematical probability model, rather than time-consuming simulation. This formula is a function of bit-length of the multiplier; thus, the compensated circuit is easily implemented for bit-length of 32, 64, or longer. Accuracy-efficiency, which indicates the signal-to-noise ratio per unit area and unit delay, is included for ease of comparison. Compared with previous works, the greatest advantage of the proposed CPIS is high speed. Furthermore, the proposed CPIS achieves higher accuracy-efficiency. Implemented using the TSMC 0.18-μm CMOS process, the proposed 32-bit Booth multiplier has an operation frequency of 50 MHz with power consumption of 7.3 mW.

13.
This work presents low-power 2's complement multipliers by minimizing the switching activities of partial products using the radix-4 Booth algorithm. Before the two input operands are multiplied, the one with the smaller effective dynamic range is processed to generate Booth codes, thereby increasing the probability that the partial products become zero. By employing the dynamic-range determination unit to control input data paths, the multiplier with a column-based adder tree of compressors or counters is designed. To further reduce power consumption, two multipliers based on row-based and hybrid-based adder trees are realized with operations on effective dynamic ranges of input data. Functional blocks of these two multipliers can preserve their previous input states for noneffective dynamic data ranges and thus reduce the number of their switching operations. To illustrate that the proposed multipliers exhibit low power dissipation, theoretical analyses of the switching activities of partial products are derived. The proposed 16 × 16-bit multiplier with the column-based adder tree conserves more than 31.2%, 19.1%, and 33.0% of the power consumed by the conventional multiplier, in applications of the ADPCM audio, G.723.1 speech, and wavelet-based image coders, respectively. Furthermore, the proposed multipliers with row-based and hybrid-based adder trees reduce power consumption by over 35.3%, 25.3% and 39.6%, and 33.4%, 24.9% and 36.9%, respectively. When considering product factors of hardware areas, critical delays and power consumption, the proposed multipliers can outperform the conventional multipliers. Consequently, the multipliers proposed herein can be broadly used in various media processing to yield low power consumption at limited hardware cost or little slowing of speed.
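The effect of operand selection can be sketched in software. The code below is a behavioural model under stated assumptions, not the paper's circuit: "effective dynamic range" is taken to mean the number of two's complement bits an operand needs once redundant sign-extension bits are dropped, and the helper names are hypothetical. Encoding the narrower operand leaves more radix-4 Booth digits equal to zero, so fewer partial products are non-zero.

```python
def effective_bits(x):
    """Bits needed to hold x in two's complement, sign bit included (assumed definition)."""
    return (x.bit_length() if x >= 0 else (~x).bit_length()) + 1


def booth_radix4_digits(x, bits=16):
    """Recode x into radix-4 Booth digits in {-2, ..., 2}, least significant first."""
    assert bits % 2 == 0 and -(1 << (bits - 1)) <= x < (1 << (bits - 1))
    u = x & ((1 << bits) - 1)            # two's complement bit pattern
    digits, prev = [], 0                 # prev is the implicit bit x[-1] = 0
    for i in range(0, bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(a, b, bits=16):
    """Return (product, number of non-zero partial products)."""
    if effective_bits(a) > effective_bits(b):
        a, b = b, a                      # Booth-encode the narrower operand
    product, nonzero = 0, 0
    for i, d in enumerate(booth_radix4_digits(a, bits)):
        if d != 0:
            product += d * b * 4 ** i    # partial product d * b, weighted by 4^i
            nonzero += 1
    return product, nonzero
```

For example, `booth_multiply(-1234, 5)` encodes 5, whose 16-bit recoding has only two non-zero digits, so only two of the eight partial products carry any switching activity in this model.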

14.

Approximate design has emerged as a revolutionary design paradigm for obtaining energy-efficient digital signal processing cores with acceptable accuracy. In many signal processing architectures, the multiplier is the primary arithmetic unit and significantly influences the performance of these cores. Therefore, four novel energy-efficient rounding-based approximate (RBA) multiplier architectures are proposed in this paper. These multipliers first approximate the input operands to the nearest power-of-two values and then perform the multiplication using a few adders and shifters. The proposed RBA multipliers significantly reduce implementation complexity and provide higher energy efficiency. Further, a novel reconfigurable rounding-based approximate (RRBA) multiplier is proposed to achieve a desired performance-quality tradeoff. The performance of the proposed RBA and RRBA multipliers is evaluated and analysed against existing approximate multiplier architectures. The proposed 8-bit RBA0 requires 59.8% (54.7%) reduced area (delay) compared to the existing approximate multiplier. Finally, the efficacy of the proposed multipliers is demonstrated in an application by implementing Gaussian filters embedded with existing and proposed approximate multipliers. The Gaussian filter designed using RBA0 provides 32.5% reduced energy consumption over the filter with the existing multiplier.
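The abstract does not give the arithmetic behind the four RBA variants. As an assumed illustration of how rounding to powers of two turns a multiplication into shifts and adds, the sketch below uses a well-known rounding-based formulation: expand A·B around the rounded operands Ar and Br, then drop the residual term (A − Ar)(B − Br). It handles unsigned integers only, and the names `round_pow2` and `approx_mul` are hypothetical, not taken from the paper.

```python
def round_pow2(x):
    """Round a positive integer to the nearest power of two (ties round up)."""
    assert x > 0
    lo = 1 << (x.bit_length() - 1)       # largest power of two <= x
    return lo if (x - lo) < (2 * lo - x) else 2 * lo


def approx_mul(a, b):
    """Approximate a * b as Ar*b + a*Br - Ar*Br, using shifts and adds only."""
    if a == 0 or b == 0:
        return 0
    sa = round_pow2(a).bit_length() - 1  # Ar = 2**sa
    sb = round_pow2(b).bit_length() - 1  # Br = 2**sb
    return (b << sa) + (a << sb) - (1 << (sa + sb))
```

In this formulation the result is exact whenever either operand is already a power of two, and the error otherwise equals the dropped product of the two rounding residuals.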


15.
Wu A., Ng C.K., Tang K.C. Electronics Letters, 1998, 34(12): 1179-1180
A pipelined modified Booth multiplication is proposed for low power, high performance DSP application. The proposed multiplication is suitable for VLSI implementation. It has a better power-performance ratio than the traditional pipelined multiplier and modified Booth multiplier.

16.
We present a novel computation sharing multiplier architecture for two's complement numbers that leads to high performance digital signal processing systems with low power consumption. The computation sharing multiplier targets the reduction of power consumption by removing redundant computations within the system through computation reuse. Use of the computation sharing multiplier leads to high-performance finite impulse response (FIR) filtering operation by reusing optimal precomputations. The proposed computation sharing multiplier is applicable to adaptive and nonadaptive FIR filter implementation. A decision feedback equalizer (DFE) was implemented based on the computation sharing multiplier in a 0.25-μm technology as an example of an adaptive filter. The performance and power consumption of the DFE using a computation sharing multiplier is compared with that of DFEs using a Wallace-tree and a Booth-encoded multiplier. The DFE implemented with the computation sharing multiplier shows improvement in performance over the DFE using a Wallace-tree multiplier, reducing the power consumption significantly.

17.
This paper presents an error compensation method for a modified Booth fixed-width multiplier that receives a W-bit input and produces a W-bit product. To efficiently compensate for the quantization error, Booth encoder outputs (not multiplier coefficients) are used for the generation of error compensation bias. The truncated bits are divided into two groups depending upon their effects on the quantization error. Then, different error compensation methods are applied to each group. By simulations, it is shown that quantization error can be reduced up to 50% by the proposed error compensation method compared with the existing method with approximately the same hardware overhead in the bias generation circuit. It is also shown that the proposed method leads to up to 35% reduction in area and power consumption of a multiplier compared with the ideal multiplier.

18.