Similar literature
20 similar documents found (search time: 46 ms)
1.
Multiplication is one of the most basic arithmetic operations. It is used in digital applications, central processing units, and digital signal processors. In most systems, the multiplier lies within the critical path and hence, due to portability and reliability issues, the power consumption of the multiplier has become very important. Moreover, as chips shrink and their power densities increase, power is becoming a major concern for chip designers. The ever-increasing demand for portable applications with limited battery lifetime indicates that power considerations should be a cornerstone of current and future designs. This motivated a novel circuit design technique for a low-power multiplier that does not compromise the multiplier's speed. This paper presents a new power-aware multiplier design based on the Wallace tree structure. A new algorithm is proposed using high‐order counters to meet the power constraints imposed by mobility and shrinking technology. Commonly used multipliers of widths 8, 16, and 32 bits are designed based on the proposed algorithm. The new approach reduces the total number of gates used in the multiplier tree. Simulations on Altera's Quartus‐II FPGA simulator showed that the design achieves an average power reduction of 18.6% compared to the original Wallace tree. The design performs even better as the multiplier's size increases, achieving a 5% gate count reduction, a 26.5% power reduction, and a 23.9% better power‐delay product in 32‐bit multipliers. Copyright © 2011 John Wiley & Sons, Ltd.
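As a rough illustration of the conventional Wallace-tree baseline this abstract compares against, the sketch below reduces the partial products of an unsigned multiplication with 3:2 carry-save steps until two rows remain. It is not the paper's high-order-counter algorithm; the function names `carry_save_add` and `wallace_multiply` and the word-level modelling are assumptions made for illustration only.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Bitwise 3:2 compression: returns (s, carry) with a + b + c == s + carry."""
    s = a ^ b ^ c                                  # per-bit sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1     # per-bit majority, shifted left
    return s, carry


def wallace_multiply(x: int, y: int, width: int = 8) -> int:
    """Multiply two unsigned `width`-bit integers by carry-save reduction.

    Hypothetical helper: models the reduction tree at word level only.
    """
    # One partial-product row per multiplier bit.
    rows = [(x << i) if (y >> i) & 1 else 0 for i in range(width)]
    # Compress groups of three rows into two until at most two rows remain.
    while len(rows) > 2:
        reduced = []
        for i in range(0, len(rows) - 2, 3):
            s, c = carry_save_add(rows[i], rows[i + 1], rows[i + 2])
            reduced.extend([s, c])
        reduced.extend(rows[len(rows) - len(rows) % 3:])  # rows left ungrouped
        rows = reduced
    # A final carry-propagate addition produces the product.
    return sum(rows)
```

For example, `wallace_multiply(13, 11)` returns 143. Each pass cuts the row count by roughly one third, which is the property that keeps the tree depth logarithmic in the operand width.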

2.
We analyze the existing bi‐level IIR‐based bit‐stream multiplier and propose selection criteria for the key design parameter governing droop and phase linearity. Based on the proposed choice of parameter, we then extend the bi‐level design to tri‐ and quad‐level architectures that offer better signal‐to‐noise performance. Hardware complexity and noise performance of these designs are also contrasted with previously proposed FIR‐based bit‐stream multipliers. Useful design guidelines are subsequently drawn. Copyright © 2010 John Wiley & Sons, Ltd.

3.
Quantum‐dot cellular automata (QCA) is an emerging technology for the rapid development of low‐power, high‐performance digital circuits. To reduce the wire crossings and the number of logic gates in QCA circuits, this paper proposes a full adder, named the Tile full adder, based on a 3 × 3 grid module; a Tile bit‐serial adder based on the new full adder; and a Diverse Clock Tile bit‐serial adder (DC Tile bit‐serial adder) based on the new full adder and a DC multiplier network. Based on these circuit units, an improved carry flow adder (CFA), named Tile CFA, and two types of carry delay multiplier (CDM), named Tile CDM and DC Tile CDM, with different sizes are presented. All of the proposed QCA circuits are designed and simulated with QCADesigner. Simulation results show that these circuit designs not only implement the logic functions correctly but also achieve a significant performance improvement. Copyright © 2015 John Wiley & Sons, Ltd.

4.
Schemes that implement finite impulse response (FIR) and infinite impulse response (IIR) digital filters when bit-serial or digit-serial arithmetic is used are proposed in this paper. The main objective is to obtain reduced latency (minimal latency at the word level) of the filter outputs while maintaining the word rate. Existing schemes (systolic or not) for filters are transferred down to the digit level and regular structures systolic at the bit or digit level are proposed. First a modified representation of a digital filter signal flow-graph appropriate for bit-serial or digit-serial arithmetic is presented. Next we show how the resulting flow-graph can be transformed to lead directly to a systolic implementation at the bit or word level. We aim towards minimizing the latency of the filter response. For this reason we work with bidirectional signal flow-graphs that lead to systolic arrays where data and partial results move in opposite directions, otherwise called two-way pipeline systolic arrays. The multipliers that are used in the implementation of the filters must have low latency themselves. For this reason they have the same two-way pipeline structure. In order to maintain the data word rate, the full-bit output of a multiplier must be rounded by a number of bits equal to the length of the data words. We propose a composite bit-serial multiplier that performs this rounding while preserving low latency and incorporate it in schemes for direct implementation of low-latency high-throughput systolic arrays for FIR and IIR digital filters. These schemes for bit-serial multipliers and filters are also extended to digit-serial arithmetic.

5.
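Design of a 32-bit fast multiplier   Total citations: 3, self-citations: 0, citations by others: 3
This paper presents a high-performance multiplier design that, through sign-bit extension, can perform both signed and unsigned 32-bit binary multiplication. The multiplier uses a high-radix Booth algorithm and simplifies the sign extension of the partial products. Speed is further increased by an array structure whose interconnect is more regular and simpler than that of a conventional Wallace tree, together with a new type of carry-lookahead adder. The whole design uses a 4-stage pipeline, was verified on an FPGA, and has been successfully applied in a time/frequency joint equalizer.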
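The abstract does not state which radix the design uses, so the sketch below assumes radix-4 Booth recoding, the most common high-radix variant. It shows how a two's-complement multiplier is recoded into digits from {-2, -1, 0, 1, 2}, halving the number of partial products. The function names and the `width` parameter are hypothetical.

```python
def booth_radix4_digits(y: int, width: int) -> list[int]:
    """Recode a two's-complement `width`-bit multiplier (width even) into
    radix-4 Booth digits, least significant first, each in {-2, ..., 2}."""
    y &= (1 << width) - 1          # keep only the low `width` bits
    digits = []
    prev = 0                       # implicit bit to the right of bit 0
    for i in range(0, width, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, width: int = 32) -> int:
    """Multiply x by the signed `width`-bit value y using the Booth digits.

    Each digit selects 0, +/-x or +/-2x, weighted by a power of four.
    """
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, width)))
```

Under this sketch, an unsigned 32-bit operand can be handled in the spirit of the sign-bit extension the abstract mentions by zero-extending it to an even wider width, for example `booth_multiply(3, 0xFFFFFFFF, width=34)`.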

6.
In this paper, a new low-frequency digital sinusoidal oscillator with uniform-frequency spacing is proposed, and its performance is evaluated. The proposed oscillator structure, P, requires a single, short word-length multiplier for its hardware implementation. For the same frequency range, the single multiplier utilized can be implemented using an 8-bit word length, assuming that the multipliers in the previously reported oscillators, such as the multiple-output direct form digital oscillator and its modified versions, are implemented using 16 bits. The saving in multiplier word length results in a reduction in the silicon area and power dissipation. The proposed oscillator structure utilizes a multiple-output direct form digital oscillator with a single, short word-length multiplier in addition to a very simple switching and sign changing operation. The proposed oscillator structure is capable of generating sinusoidal signals with a large number of samples per cycle. The phase of the generated sinusoids is continuous at the switching points, and the difference between any two adjacent generated frequencies is approximately constant. Simulation results are presented to verify the analysis and demonstrate the performance as measured in terms of total harmonic distortion. The simulation results show that the performance of the proposed oscillator is essentially the same as that of the previously reported oscillators, with less hardware complexity.
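For orientation, the textbook direct-form digital oscillator that the abstract builds on can be sketched as a second-order recurrence with one multiplier coefficient, 2cos(ω). This is only the baseline recurrence in floating point, not the proposed structure with its switching and sign-changing logic; the function name is hypothetical.

```python
import math


def direct_form_oscillator(omega: float, n_samples: int) -> list[float]:
    """Generate sin(n * omega) for n = 0 .. n_samples - 1 using
    y[n] = k * y[n-1] - y[n-2], where k = 2 * cos(omega) is the only multiplier."""
    k = 2.0 * math.cos(omega)
    y_prev2, y_prev1 = 0.0, math.sin(omega)   # y[0] and y[1] seed the recurrence
    out = [y_prev2, y_prev1]
    while len(out) < n_samples:
        y = k * y_prev1 - y_prev2
        out.append(y)
        y_prev2, y_prev1 = y_prev1, y
    return out[:n_samples]
```

At low frequencies the coefficient 2cos(ω) lies very close to 2, so neighbouring frequencies differ only in its low-order bits. This is one way to see why word length matters in such designs.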

7.
This study proposes two ultraefficient imprecise multipliers based on innovative 4:2 approximate compressor designs. The first proposed multiplier employs an ultracompact 8‐transistor 4:2 compressor to reduce the transistor count and energy dissipation. To improve the accuracy of the first proposed imprecise multiplier, the second multiplier also benefits from a semiaccurate 24‐transistor 4:2 approximate compressor for the high‐order bits. The 7‐nm fin field‐effect transistor (FinFET) technology, as one of the leading commercial technologies, is utilized to simulate the proposed multipliers in the environment. The simulation results indicate that the proposed imprecise multipliers show significant improvements regarding transistor count, delay, power, and power‐delay product as compared to their state‐of‐the‐art imprecise and exact counterparts. Along with the superior hardware efficiency, the MATLAB simulations demonstrate that the proposed multipliers also provide reasonable levels of accuracy. Moreover, a figure of merit (FOM) considering hardware efficiency and output quality is considered in order to evaluate the multipliers comprehensively. The FOM simulation results indicate that the proposed imprecise multipliers make a significant trade‐off between hardware efficiency and quality for approximate‐computing applications dealing with image multiplication.

8.
Wave digital filters which are derived by previously known methods from LC ladder reference filters having attenuation poles at frequencies other than zero or infinity, although being canonic in number of multipliers, are not canonic in number of delays. A simple procedure is given by means of which a canonic realization can always be derived from an originally noncanonic ladder wave digital low-pass, high-pass, or band-pass filter. This procedure does not affect the values of the multiplier coefficients.

9.
The main objective of this paper is to design and implement minimum-multiplier, low-latency structures of a comb filter. Multipliers are the most area- and power-consuming elements; therefore, it is desirable to realize a filter with a minimum number of multipliers. In this paper, a design of comb filters based on the lattice wave digital filter (LWDF) structure is proposed to minimize the number of multipliers. The fundamental processing unit employed in the LWDF requires only one multiplier. These lattice wave digital comb filters (LWDCFs) are realized using Richards' and transformed first‐order and second‐order all‐pass sections. The resulting structural realizations of LWDCFs exhibit properties such as low coefficient sensitivity, high dynamic range, high overflow level, and low round‐off noise. Multiplier coefficients of the proposed structures are implemented with the canonic signed digit code (CSDC) technique using shift and add operations, leading to a multiplierless implementation. This reduces the number of addition levels, which in turn reduces the latency of the critical loop. A field programmable gate array (FPGA) platform is used for evaluation and testing of the proposed LWDCFs to gain the advantages of parallelism, low cost, and low power consumption. The proposed LWDCFs are implemented on Xilinx Spartan‐6 and Virtex‐6 FPGA devices. By means of examples, it is shown that the implementations of the proposed LWDCFs attain high maximum sampling frequency, reduced hardware, and low power dissipation compared with the existing comb filter structures. Copyright © 2017 John Wiley & Sons, Ltd.
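The canonic signed digit idea mentioned in this abstract can be sketched as follows: a coefficient is rewritten with digits from {-1, 0, 1} so that no two adjacent digits are non-zero, and multiplication then becomes a short sequence of shifts with additions or subtractions. The sketch assumes a non-negative integer coefficient (for example, a fixed-point coefficient already scaled to an integer); the function names are hypothetical.

```python
def to_csd(n: int) -> list[int]:
    """Return the canonic signed digit form of a non-negative integer,
    least significant digit first, with digits in {-1, 0, 1}."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)    # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d             # clears the low bit (may carry upward)
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_multiply(x: int, coeff: int) -> int:
    """Multiply x by coeff using only shifts, additions and subtractions."""
    acc = 0
    for shift, d in enumerate(to_csd(coeff)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc
```

For instance, 7 becomes 8 - 1 (`to_csd(7)` gives `[-1, 0, 0, 1]`), so multiplying by 7 needs one subtraction rather than two additions.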

10.
This paper presents efficient and fast hardware implementations of the complete point multiplication on binary Edwards curves (BECs). The implementations are based on extremely fast complete differential addition and doubling formulas. These new complete differential addition formulas are performed for general and special cases of BECs with costs of 5M + 4S + 2D and 5M + 4S + 1D, respectively, where M, S, and D denote the costs of a field multiplication, a field squaring, and a field multiplication by a constant, respectively. In the general case of the BECs, the proposed structures are implemented based on 3 and 1 pipelined digit‐serial Gaussian normal basis multipliers. In the design with 3 multipliers, the computation of point addition and point doubling is performed concurrently. In the second implementation, a low‐cost design with few hardware resources, these computations are implemented by 1 multiplier. Also, in the special case of BECs, 2 structures are proposed for achieving the highest degree of parallelization and utilization of resources by using 3 and 2 field multipliers. Implementation results of the proposed architectures on Virtex‐5 XC5VLX110, Virtex‐4 XC4VLX100, and Arria‐10 10AX115U4F45I3SG FPGAs are reported for two fields. The results show improvements in terms of execution time, area, and efficiency for the proposed structures compared with previous works.

11.
Bruce, J.W. IEEE Potentials, 2001, 20(3): 24-28
This article covers two popular types of Nyquist-rate digital-to-analog converters (DACs): the flash DAC and the serial DAC. Flash DACs perform their conversion in a single clock cycle and are typically designed to operate at high speeds. Serial DACs convert the digital signal to an analog signal one bit at a time. Serial DACs trade the hardware complexity of a flash DAC for longer conversion times. In this article, three variations of flash DACs are introduced along with two serial DACs. The advantages and disadvantages of each architecture are also discussed.

12.
Real-time digital signal and image processing applications, such as filtering, demand high performance. Often, multiplication is one of the most time-consuming steps of the filtering operation. Log-based multipliers have been used for improving multiplication efficiency at the expense of accuracy. The objective of the proposed work is to improve the accuracy of log-based hardware multipliers by appropriately altering the filter weights and without increasing the required resources.
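The abstract does not name a specific log-based scheme, so the sketch below uses Mitchell's classic logarithmic approximation as a stand-in to show where the accuracy loss comes from. It does not implement the weight-altering correction that the abstract proposes, and the function name is hypothetical.

```python
def mitchell_multiply(a: int, b: int) -> int:
    """Approximate a * b for non-negative integers with Mitchell's method.

    Each operand is written as 2**k * (1 + f) with 0 <= f < 1, and log2 is
    approximated by k + f, so the product needs only shifts and additions.
    """
    if a == 0 or b == 0:
        return 0
    ka, kb = a.bit_length() - 1, b.bit_length() - 1   # leading-one positions
    ra, rb = a - (1 << ka), b - (1 << kb)             # mantissa remainders
    # (fa + fb) scaled by 2**(ka + kb), kept as an exact integer.
    frac_sum = (ra << kb) + (rb << ka)
    one = 1 << (ka + kb)
    if frac_sum < one:
        return one + frac_sum      # 2**(ka+kb) * (1 + fa + fb)
    return 2 * frac_sum            # 2**(ka+kb+1) * (fa + fb)
```

The approximation never exceeds the true product and underestimates it by at most one ninth (about 11.1%), with the worst case at operands such as 3 × 3, where it returns 8. Because this error is systematic, a compensation step of some kind can plausibly recover part of the lost accuracy.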

13.
A wave digital filter design method is discussed which produces good selectivity and flat delay characteristics. The filter is designed in the microwave domain with monotonic stopband response using Carlin and Wu's technique for minimum-phase linear phase commensurate line transducers, then realized in the digital domain with Fettweis relations. A comparison is made of frequency response sensitivity to multiplier coefficient truncation between this wave filter and some FIR's, all with approximately the same number of multipliers. The cascade realizations of the FIR's show considerable passband amplitude response deterioration due to coefficient truncation, whereas the direct realizations for the FIR's show significant deterioration of stopband amplitude response owing to coefficient truncation. Coefficient truncation for the wave digital filter caused relatively small deterioration of both passband and stopband amplitude responses as well as small distortion of the flat delay characteristic in the passband.

14.
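Chen Chang, You Yu. Electronic Measurement Technology (《电子测量技术》), 2007, 30(11): 47-50
This paper introduces the principle and system composition of a logic signal source. It describes the hardware design of the system platform and the functions of the main unit modules, and gives a fairly detailed account of the main ideas and data flow of the program design inside the FPGA. Simulation results show that the system can output a variety of serial digital bit streams, including protocols such as SPI, IIC, USB, RS232, and CAN, as well as parallel digital bit streams. Test results show that the system can supply the digital signals needed for a wide range of digital stimulus requirements: it can generate the required streams of 1s and 0s for testing computer buses, microprocessor IC devices, and other digital systems.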

15.
Power digital-to-analogue (D–A) converters have applications in digital audio, portable equipment and industrial control systems. This paper discusses the use of sigma–delta modulation as the primary processing block in power D–A systems. The focus is on the pulse repetition frequency (PRF) of the one-bit output and its suitability for power switching. It is shown that in order to preserve high power efficiency and tolerance to a non-ideal output stage, the PRF of the output may be reduced and made constant by the use of a technique termed bit flipping. The performance of different bit flipping algorithms is discussed which aim to regulate the PRF whilst maintaining the stability of the modulation process. Results are presented which compare the performance of the different systems under the conditions of an ideal and a non-ideal output stage. © 1997 by John Wiley & Sons, Ltd.

16.
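Application of wavelet networks to power system fault signal processing   Total citations: 1, self-citations: 1, citations by others: 1
Starting from the different ways in which wavelet analysis represents function approximation, three main types of wavelet network are introduced. Their structure, network models, and learning algorithms are described in detail and compared, and the essential differences among them are identified. On this basis, the application of wavelet networks to fault signal classification and fault data compression in power systems is discussed. The corresponding digital simulation results show that applying wavelet networks to power system fault signal processing is entirely feasible.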

17.
Studies of the optimal multiplier (or optimal step size) modification to the standard Newton-Raphson (NR) load flow have mainly focused on highly stressed and unsolvable systems. This paper extends these previous studies by comparing performance of the NR load flow with and without optimal multipliers for a variety of unstressed, stressed, and unsolvable systems. Also, the impact of coordinate system choice in representing the voltage phasor at each bus is considered. In total, four solution methods are compared: the NR algorithm with and without optimal multipliers using polar and rectangular coordinates. This comparison is carried out by combining analysis of the optimal multiplier technique with empirical results for two-bus, 118-bus, and 10 274-bus test cases. These results indicate that the polar NR load flow with optimal multipliers is the best method of solution for both solvable and unsolvable cases.

18.
Bruce, J.W., II. IEEE Potentials, 1998, 17(5): 36-39
To bring digital processing and its benefits to bear on real-world applications, the analog signal of interest must be translated into a format a digital computer can utilize. This is the function of the analog-to-digital converter (ADC). After processing by a digital computer or digital signal processor (DSP), the resulting digital stream of information must be returned to its analog form by a digital-to-analog converter (DAC). The methods by which a digital code is generated within the ADC are diverse. We introduce three popular Nyquist-rate ADC architectures used today: the counter ramp ADC, the successive approximation ADC, and the flash ADC.
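Of the three architectures named, the successive approximation ADC maps most directly onto a short algorithm: a binary search over output codes, deciding one bit per clock cycle from the most significant bit down. The sketch below is an idealised behavioural model, assuming a perfect internal DAC and comparator and a unipolar input range of 0 to `v_ref`; the function and parameter names are hypothetical.

```python
def sar_adc(v_in: float, v_ref: float, bits: int) -> int:
    """Idealised successive-approximation conversion of v_in to a `bits`-bit code."""
    code = 0
    for bit in reversed(range(bits)):            # MSB first, one decision per cycle
        trial = code | (1 << bit)                # tentatively set this bit
        v_dac = trial * v_ref / (1 << bits)      # ideal internal DAC output
        if v_in >= v_dac:                        # comparator decision
            code = trial                         # keep the bit
    return code
```

With an 8-bit converter and a 5 V reference, `sar_adc(2.5, 5.0, 8)` returns 128. The conversion always takes `bits` comparisons, which places it between the counter ramp ADC (up to 2**bits steps) and the flash ADC (a single step using many comparators).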

19.
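This paper describes the design of the multiplier in an 8-bit RISC microcontroller. It analyzes the operating principles of several multiplier types, including shift-and-add, adder-tree, and Booth-encoded shift-and-add multipliers, and implements them with the Synopsys synthesis tool. Synthesis and simulation results show that the Booth-encoded shift-and-add multiplier, designed around the characteristics of this 8-bit RISC microcontroller, is considerably faster than the other types, while its area is less than 18% larger than that of the smallest one, the shift-and-add multiplier. Considering both speed and area, it is the better design choice.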

20.
The advantages of a multiplier‐less architecture are reductions in hardware and latency. This paper proposes multiplier‐less architectures for the implementation of a radix‐2² folded pipelined complex FFT core based on the coordinate rotation digital computer (CORDIC) and new distributed arithmetic (NEDA). The number of points considered in the work is sixteen, and the folding is done by a factor of four. The proposed designs have been implemented on a Xilinx XC5VSX240T‐2FF1738 FPGA and have also been synthesized using the Synopsys design compiler. The NEDA‐based designs reduce area by over 83%, and the CORDIC‐based design by over 78%. The observed slice‐delay products are 2.196 and 5.735 for the NEDA‐based designs and 2.369 for the CORDIC‐based design. Copyright © 2014 John Wiley & Sons, Ltd.
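To make the CORDIC half of this abstract concrete, the sketch below rotates a vector through a given angle using only shift-like scalings by powers of two and additions, which is how a twiddle-factor multiplication in an FFT can avoid a true multiplier. It is a floating-point behavioural model in rotation mode, not the paper's architecture; the function name, the iteration count, and the final gain correction by division are assumptions for illustration.

```python
import math


def cordic_rotate(x: float, y: float, angle: float, iterations: int = 16) -> tuple[float, float]:
    """Rotate (x, y) counter-clockwise by `angle` radians with CORDIC.

    Converges for |angle| up to about 1.74 rad. In hardware the 2**-i factors
    are wire shifts and the arctangent values come from a small lookup table.
    """
    # Accumulated gain of the micro-rotations, divided out at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle                                   # residual angle still to rotate
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0             # rotate toward zero residual
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x / gain, y / gain
```

Rotating `(1.0, 0.0)` by π/6 in this model yields approximately (cos π/6, sin π/6), and each additional iteration contributes roughly one more bit of angular precision.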
