1.

Srinivas H.R. Parhi K.K. 《Solid-State Circuits, IEEE Journal of》1992,27(5):761-767

An architecture for performing fixed-point, high-speed, two's-complement, bit-parallel addition by using the carry-free property of redundant arithmetic and a fast parallel redundant-to-binary conversion scheme is presented. The internal numbers are represented in radix-2 redundant digit form, and the inputs and the output of the adder are represented in two's-complement binary form. The adder operands are added first in a radix-2 redundant adder to produce the result in radix-2 digit (-1, 0, 1) form. This result is converted to two's-complement binary form using the parallel conversion scheme. The high-speed conversion for long words is achieved through the use of a novel sign-select operation. The proposed adder, referred to as the sign-select conversion adder, is faster than all previous high-speed two's-complement binary adders for large word lengths. The implementation is highly regular with repeated modules and is very well suited for VLSI implementation.
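As a rough illustration of the redundant-to-binary step described in this abstract, the sketch below converts a radix-2 signed-digit number (digits -1, 0, 1) to two's complement by the textbook route of subtracting its negative-digit part from its positive-digit part. This is not the paper's sign-select scheme, whose details the abstract does not give; the function names and the least-significant-digit-first list layout are assumptions made for the example.

```python
def sd_value(digits):
    """Integer value of a radix-2 signed-digit number, least significant digit first."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def sd_to_twos_complement(digits, width):
    """Convert signed digits (-1, 0, 1) to a `width`-bit two's-complement word.

    Textbook method: collect the +1 digits into one binary number and the
    -1 digits into another, then subtract. The subtraction is where carry
    (borrow) propagation reappears.
    """
    pos = sum(1 << i for i, d in enumerate(digits) if d == 1)
    neg = sum(1 << i for i, d in enumerate(digits) if d == -1)
    return (pos - neg) & ((1 << width) - 1)


def from_twos_complement(word, width):
    """Interpret a `width`-bit word as a signed integer."""
    sign = 1 << (width - 1)
    return (word ^ sign) - sign
```

For example, the digits `[-1, 0, 1, -1]` (least significant first) encode -1 + 4 - 8 = -5, which becomes `0b11011` in a 5-bit word.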

2.

Design of a high-performance parallel fully redundant decimal multiplier

张柳崔晓平董文雯《电子学报》2018,46(6):1519-1523

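Demand for high-precision computation in fields such as commercial computing and financial analysis places increasingly high requirements on hardware decimal arithmetic. In existing fully redundant decimal multipliers, the structural complexity of the fully redundant adder has become a bottleneck to further performance gains. This paper presents an optimized fully redundant ODDS adder, based on the overloaded decimal digit set (Overloaded Decimal Digit Set, ODDS), that reduces this complexity, and designs a new decimal compression tree module built on that adder. The partial product generation module uses signed radix-10 encoding and redundant binary coded decimal (BCD) encoding to generate decimal partial products quickly. The final product generation module uses an optimized code conversion circuit to produce the BCD-8421 product quickly. Experimental results show that the proposed parallel fully redundant decimal multiplier is relatively fast and small in area.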

3.

Design of memristor-based multiplier circuits

王光义沈书航刘公致李付鹏《电子与信息学报》2020,42(4):827-834

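As a new type of non-volatile circuit element, the memristor has good application prospects in digital logic circuits. Current research on memristor-based logic circuits mainly covers full adders, multipliers, and exclusive-OR (XOR) and exclusive-NOR (XNOR) gates, and memristive multipliers remain comparatively little studied. This paper designs memristor-based 2-bit binary multiplier circuits in two different ways. The first uses an improved multi-function "XOR" and "AND" logic module to build a 2-bit binary multiplier. The second builds a 2-bit binary multiplier from a new ratioed logic, namely unit gate circuits consisting of one memristor and one NMOS transistor. The two multipliers are compared and verified by LTSPICE simulation. The proposed multiplier uses only 2 N-type metal-oxide-semiconductor (NMOS) transistors and 18 memristors (the other design uses 6 NMOS transistors and 28 memristors), which greatly reduces the number of transistors compared with earlier memristive multipliers.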
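The abstract builds a 2-bit binary multiplier out of XOR and AND functions. The sketch below shows the standard gate-level decomposition of such a multiplier using only those two operations. It models the Boolean logic only, not the memristor or NMOS circuitry, and the function name `mul2` is hypothetical.

```python
def mul2(a, b):
    """Multiply two 2-bit numbers (0..3) using only AND and XOR on single bits."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1

    p0 = a0 & b0          # least significant product bit
    x = a1 & b0           # the two cross terms of weight 2
    y = a0 & b1
    p1 = x ^ y            # half-adder sum of the cross terms
    c1 = x & y            # half-adder carry into weight 4
    z = a1 & b1           # term of weight 4
    p2 = z ^ c1
    p3 = z & c1           # carry into weight 8 (only set for 3 * 3)

    return p0 | (p1 << 1) | (p2 << 2) | (p3 << 3)
```

All 16 input combinations can be checked exhaustively against ordinary integer multiplication.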

4.

A Radix-4 New Svoboda-Tung Divider with Constant Timing Complexity for Prescaling

Jen-Shiun Chiang Min-Shiou Tsai 《The Journal of VLSI Signal Processing》2003,33(1-2):117-124

A new floating-point division architecture that complies with the IEEE 754-1985 standard is proposed in this paper. This architecture is based on the New Svoboda-Tung (NST) division algorithm and the radix-4 MROR (maximally redundant maximally recoded) signed digit number system. In NST division, the divisor and dividend must be prescaled. We summarize a general systematic method to accomplish the prescaling, and we also propose a hardware scheme such that the timing complexity is constant regardless of the bit length of the divisor. For the divider implementation, a new MROR signed digit adder with a carry-free characteristic is proposed for addition and subtraction, and this adder can improve the cycle time significantly. A 32-b/32-b radix-4 divider is thus designed in Verilog HDL; the simulation results show that this architecture is implementable using currently available libraries. The hardware complexity and performance of this divider are competitive with conventional SRT dividers.

5.

Research and application of an FPGA-based binary memristor emulator

周景张玮琦张露苗张章《微电子学》2023,53(1):75-80

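Exploiting the reconfigurability of FPGAs, a binary memristor emulator based on digital circuits is proposed. Compared with analog-circuit memristor emulators, the proposed digital emulator is easy to reconfigure, matches the mathematical model on which it is based well, and has all the characteristics required of a memristor emulator. An AND gate, an OR gate, an adder, and a three-input majority voter are implemented on the basis of this emulator. Altera Quartus II and ModelSim are used to verify the emulator's functionality and the logic circuits built on it. Schematics, simulation results, and FPGA resource consumption are given for all designed circuits. Simulation results show that the binary memristor emulator consumes fewer hardware resources than other digital-circuit memristor emulators and is better suited to research on large-scale memristor arrays.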

6.

Design of a decimal adder based on a parallel prefix structure

王书敏崔晓平《电子科技》2016,29(6):19

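To address the need to handle invalid codes when BCD decimal addition is implemented in hardware, a decimal adder based on a parallel prefix structure is designed. The adder follows an algorithm that pre-adds 6, uses binary addition to obtain an intermediate sum, and then applies a subtract-6 correction; the subtract-6 correction step is merged into a redesigned subtract-6-correcting carry-select adder, and the parallel prefix structure is fully exploited to greatly increase the parallelism of the circuit. The adder is implemented in Verilog HDL and synthesized with Design Compiler. The resulting 32-bit, 64-bit, and 128-bit decimal adders have delays of 0.56 ns, 0.61 ns, and 0.71 ns and areas of 1 310 μm², 2 681 μm², and 5 485 μm², respectively.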
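The pre-add-6, binary-add, subtract-6 algorithm named in this abstract can be sketched in software as follows. This is a behavioural model on packed BCD integers, assuming valid BCD inputs; it does not reproduce the paper's parallel prefix or carry-select hardware, and the function name and argument layout are assumptions made for the example.

```python
def bcd_add(a, b, digits):
    """Add two packed-BCD integers holding `digits` decimal digits each.

    Returns (packed BCD sum, decimal carry-out).
    """
    sixes = int("6" * digits, 16)       # 0x66...6, one 6 per digit
    t = a + sixes                       # pre-add 6 to every digit of a (digit <= 9, so no nibble overflows)
    s = t + b                           # ordinary binary addition gives the intermediate sum
    carry_in = t ^ b ^ s                # bit k of this word is the carry INTO bit position k

    for i in range(digits):
        carried_out = (carry_in >> (4 * (i + 1))) & 1
        if not carried_out:
            # No decimal carry left this digit, so the +6 was not needed: undo it.
            s -= 6 << (4 * i)

    mask = (1 << (4 * digits)) - 1
    return s & mask, (s >> (4 * digits)) & 1
```

Adding 6 makes a decimal carry (digit sum of 10 or more) coincide with a binary carry out of the 4-bit nibble, which is why a plain binary adder can be reused. For instance, `bcd_add(0x1234, 0x5678, 4)` yields `(0x6912, 0)`.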

7.

Design of a memristor-CMOS hybrid modular inversion circuit

高德志容源江先阳《信息技术》2020,(4):10-16,22

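Modular inversion is the most complex operation in encryption algorithms and one of their most critical modules. The memristor is a strong contender to replace existing transistors and thereby extend Moore's law. Combining the current state of research in information security and memristors, this paper applies the memristive implication mechanism to the design of a modular inversion circuit and studies the feasibility and suitability of memristors in large-scale digital circuits. First, a memristive implication logic circuit model is proposed on an FPGA platform, and functional modules such as basic logic gates and adders are implemented with it. These modules are then used to design a memristor-CMOS hybrid modular inversion circuit based on the binary extended Euclidean algorithm. Simulation and verification show that the modular inversion module correctly performs its designed function at a 200 MHz clock.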
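To make the step from implication logic to gates and adders concrete, the sketch below models material implication (IMPLY) as a Boolean function, derives NAND from IMPLY plus a reset-to-0 operation, and builds a full adder from NAND. This is a generic logic-level illustration, not the paper's FPGA circuit model; all function names are hypothetical.

```python
def imply(p, q):
    """Material implication p -> q on bits: (NOT p) OR q."""
    return (1 - p) | q


def nand_via_imply(p, q):
    """NAND from two IMPLY steps and one FALSE (reset) step on a work cell s."""
    s = 0               # FALSE: reset the work cell
    s = imply(p, s)     # s = NOT p
    s = imply(q, s)     # s = (NOT q) OR (NOT p) = NAND(p, q)
    return s


def full_adder(a, b, cin):
    """Full adder built from nine NAND gates; returns (sum, carry_out)."""
    n1 = nand_via_imply(a, b)
    x = nand_via_imply(nand_via_imply(a, n1), nand_via_imply(b, n1))       # a XOR b
    n4 = nand_via_imply(x, cin)
    s = nand_via_imply(nand_via_imply(x, n4), nand_via_imply(cin, n4))     # x XOR cin
    cout = nand_via_imply(n1, n4)                                          # ab OR (x AND cin)
    return s, cout
```

Because NAND is functionally complete, any combinational block can in principle be composed this way, at the cost of several sequential IMPLY steps per gate.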

8.

High-performance VLSI multiplier with a new redundant binary coding

Xiaoping Huang Belle W. Y. Wei Honglu Chen Yuhai H. Mao 《The Journal of VLSI Signal Processing》1991,3(4):283-291

This paper describes the design of a 16×16 redundant binary multiplier for signed 2's complement numbers. The multiplier uses a new coding scheme for representing radix-2 signed digits. The coding results in a factor of two reduction in the number of summands used with respect to the modified Booth algorithm. The design has a small number of modular cells and regular routing, making it suitable for automatic synthesis of larger data-width multipliers. In addition, the row-based redundant binary adder tree is an ideal structure for high-throughput applications. This work was supported in part by National Science Foundation grant MIP-9019862.

9.

A fast final adder for a 54-bit parallel multiplier for DSP application

Subhendu Kumar Sahoo Chandra Shekhar 《International Journal of Electronics》2013,100(12):1625-1638

A novel redundant binary-to-natural binary converter circuit is proposed which is used in the final addition stage of parallel multipliers. Use of this circuit in the final adder stage proves to be 17% faster than carry-look-ahead implementation. We used this algorithm in such a way that no redundant binary adder is required in compressing partial product rows. Only the natural 4:2 compressor circuits are used.

10.

A new carry-free division algorithm and its application to a single-chip 1024-b RSA processor

Vandemeulebroecke A. Vanzieleghem E. Denayer T. Jespers P.G.A. 《Solid-State Circuits, IEEE Journal of》1990,25(3):748-756

A carry-free division algorithm is described. It is based on the properties of redundant signed digit (RSD) arithmetic to avoid carry propagation and uses the minimum hardware per bit, i.e. one full adder. Its application to a 1024-b RSA (Rivest, Shamir, and Adleman) cryptographic chip is presented. The features of this new algorithm allowed high performance (8 kb/s for 1024-b words) to be obtained for relatively small area and power consumption (80 mm² in a 2-μm CMOS process and 500 mW at 25 MHz).

11.

A fully redundant decimal adder and its application in parallel decimal multipliers

Saeid Gorgin 《Microelectronics Journal》2009,40(10):1471-1481

Decimal hardware arithmetic units have recently regained popularity, as there is now a high demand for high performance decimal arithmetic. We propose a novel method for carry-free addition of decimal numbers, where each equally weighted decimal digit pair of the two operands is partitioned into two weighted bit-sets. The arithmetic values of these bit-sets are evaluated, in parallel, for fast computation of the transfer digit and interim sum. In the proposed fully redundant adder (vs. semi-redundant ones such as decimal carry-save adders) both operands and sum are redundant decimal numbers with overloaded decimal digit set [0, 15]. This adder is shown to improve upon the latest high performance similar works and outperform all previous comparable adders. However, there is a drawback that the adder logic cannot be efficiently adapted for subtraction. Nevertheless, this adder and its restricted-input varieties are shown to efficiently fit in the design of a parallel decimal multiplier. The two-to-one partial product reduction ratio that is attained via the proposed adder has led to a VLSI-friendly recursive partial product reduction tree. Two alternative architectures for decimal multipliers are presented; one is slower, but area-improved, and the other one consumes more area, but is delay-improved. However, both are faster in comparison with previously reported parallel decimal multipliers. The area and latency comparisons are based on logical effort analysis under the same assumptions for all the evaluated adders and multipliers. Moreover, performance correctness of all the adders is checked via running exhaustive tests on the corresponding VHDL codes. For more reliable evaluation, we report the result of synthesizing these adders by Synopsys Design Compiler using TSMC 0.13 μm standard CMOS process under various time constraints.
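The overloaded decimal digit set [0, 15] mentioned in this abstract keeps radix-10 weights but lets each digit take any 4-bit value, so a number can have several encodings. The sketch below evaluates such a number and converts it to conventional decimal digits. It illustrates the number representation only, not the paper's carry-free adder; the function names and the least-significant-digit-first layout are assumptions.

```python
def odds_value(digits):
    """Value of an ODDS number: digit i is in 0..15 and has weight 10**i."""
    assert all(0 <= d <= 15 for d in digits)
    return sum(d * 10**i for i, d in enumerate(digits))


def odds_to_decimal_digits(digits):
    """Convert ODDS digits to conventional digits 0..9 (least significant first).

    Unlike redundant addition, this final conversion needs carry propagation.
    """
    out, carry = [], 0
    for d in digits:
        total = d + carry          # at most 15 + 1 = 16
        out.append(total % 10)
        carry = total // 10        # always 0 or 1
    if carry:
        out.append(carry)
    return out
```

Both `[2, 1]` and `[12, 0]` represent twelve, which is the redundancy a carry-free adder exploits; `[15, 15]` represents 165 and converts to the digits `[5, 6, 1]`.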

12.

Redundant binary partial product generators for compact accumulation in Booth multipliers

Bijoy A. Jose 《Microelectronics Journal》2009,40(11):1606-1612

The use of signed-digit number systems in arithmetic circuits has the advantage of constant time addition. When signed-digit number systems are used in binary, they are referred to as redundant binary. Here, we present a new encoding technique for generating redundant binary partial products for a multiplier, without using any additional hardware. We express each normal binary partial product in one's complement form, with an extra bit denoting the sign bit. The proposed redundant binary partial product generator (RBPPG) achieves the highest reduction in the number of partial products (75%) for a radix-4 multiplier. The carry-free nature of redundant binary adders is exploited to add the extra bits with the partial products, without using any extra adder stages. The new partial product generation (PPG) technique is shown to improve the speed of multipliers, with the least number of adder stages, irrespective of the multiplier size.

13.

Hybrid low-latency serial-parallel multiplier architecture

Al-Besher B. Bouridane A. Ashur A.S. Crookes D. 《Electronics letters》1998,34(2):141-143

A novel low-latency, most-significant-digit-first, signed digit multiplier architecture is presented. The design of the multiplier is based on a new 2-bit adder cell. Judicious deployment of latches in the circuit ensures that the multiplier operates on two coefficients of the multiplicand at the same time and produces one 2n-digit product every 2n+3 cycles with an initial delay (latency) of three cycles. Comparison with existing multipliers has shown a superior performance of the proposed architecture.

14.

Redundant signed binary addition based digital-to-frequency converter

Chen W. Thornton M.A. Gui P. 《Electronics letters》2009,45(16):824-826

An accumulator-based digital-to-frequency converter (DFC) employing redundant signed binary addition (RSBA) is presented. RSBA is advantageous in that no carry propagation occurs, resulting in constant delay regardless of operand word size. Utilising RSBA in the proposed DFC resolves the performance bottleneck in the DFC's conventional implementation and achieves extremely high frequency resolution. In addition, a new RSBA-based 8:1 multiplexer is introduced for a complete RSBA implementation of the DFC. Experimental results show an increase of more than 3.5 times in the speed of the accumulator compared to the conventional implementation regardless of bit size of the adder.
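The claim that redundant signed binary addition has no carry propagation can be illustrated with the classical two-step rule for radix-2 signed digits: each position picks a transfer and an interim digit by looking only at its own digits and those one position below, so every result digit depends on a fixed-size window of inputs. This is a generic textbook sketch, not the adder cell of the cited letter; the names are hypothetical and digits are stored least significant first.

```python
def sd_value(digits):
    """Integer value of a radix-2 signed-digit number, least significant digit first."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def sd_add(x, y):
    """Carry-free addition of two equal-length signed-digit numbers (digits -1, 0, 1)."""
    n = len(x)
    transfer = [0] * (n + 1)   # transfer[i] is the digit passed INTO position i
    interim = [0] * n
    for i in range(n):         # iterations are independent: hardware does them in parallel
        p = x[i] + y[i]        # position sum, satisfies p == 2 * transfer[i + 1] + interim[i]
        # If both lower digits are non-negative, the incoming transfer is 0 or +1.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if p == 2:
            transfer[i + 1] = 1
        elif p == -2:
            transfer[i + 1] = -1
        elif p == 1:
            if lower_nonneg:
                transfer[i + 1], interim[i] = 1, -1
            else:
                interim[i] = 1
        elif p == -1:
            if lower_nonneg:
                interim[i] = -1
            else:
                transfer[i + 1], interim[i] = -1, 1
    # Second step: add interim digit and incoming transfer; the choice above
    # guarantees the result stays in {-1, 0, 1}, so nothing ripples further.
    return [interim[i] + transfer[i] for i in range(n)] + [transfer[n]]
```

Adding `[1, 1, 1, 1]` (15) and `[1, 0, 0, 0]` (1) gives `[0, 0, 0, 0, 1]` (16): a case that forces a full-length carry ripple in an ordinary binary adder is resolved here with each digit computed locally.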

15.

Weighted two-valued digit-set encodings: unifying efficient hardware representation schemes for redundant number systems

Jaberipur G. Parhami B. Ghodsi M. 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(7):1348-1357

We introduce the notion of two-valued digit (twit) as a binary variable that can assume one of two different integer values. Posibits, or simply bits, in {0,1} and negabits in {-1,0}, commonly used in two's-complement representations and (n,p) encoding of binary signed digits, are special cases of twits. A weighted bit-set (WBS) encoding, which generalizes the two's-complement encoding by allowing one or more posibits and/or negabits in each radix-2 position, has been shown to unify many efficient implementations of redundant number systems. A collection of equally weighted twits, including ones with noncontiguous values (e.g., {-1,1} or {0,2}), can lead to wider representation range without the added storage and interconnection costs associated with multivalued digit sets. We present weighted twit-set (WTS) encodings as a generalization of WBS encodings, examine key properties of this new class of encodings, and show that any redundant number system (e.g., generalized signed-digit and hybrid-redundant systems), including those that are based on noncontiguous and/or zero-excluded digit sets, is faithfully representable by WTS encoding. We highlight this broad coverage by a tree chart having WTS representations at its root and various useful redundant representations at its many internal nodes and leaves. We further examine how highly optimized conventional components such as standard full/half-adders and compressors may be used for arithmetic on WTS-encoded operands, thus allowing highly efficient and VLSI-friendly circuit implementations. For example, focusing on the WBS-like subclass of WTS encodings, we describe a twit-based implementation of a particular stored-transfer representation which offers area and speed advantages over other similar designs based on WBS and hybrid-redundant representations.
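A small evaluator can make the posibit, negabit and twit vocabulary of this abstract concrete. In the sketch below a twit is modelled as a pair giving its value for stored bit 0 and stored bit 1, and a number is a list of positions, each holding any number of twits of weight 2^i. The data layout, the constant names and the choice of which stored bit maps to which value are assumptions made for the example, not the paper's notation.

```python
# A twit kind is (value when stored bit is 0, value when stored bit is 1).
POSIBIT = (0, 1)    # ordinary bit
NEGABIT = (0, -1)   # two's-complement sign-bit convention: stored 1 means -1


def wts_value(positions):
    """Value of a weighted twit-set number.

    positions[i] is a list of (kind, stored_bit) pairs, all of weight 2**i.
    """
    total = 0
    for i, twits in enumerate(positions):
        for (v0, v1), bit in twits:
            total += (v1 if bit else v0) * (1 << i)
    return total
```

Under these assumptions the 4-bit two's-complement word 1011 is three posibits plus one negabit in the top position and evaluates to -5, while a negabit and a posibit sharing one position form a binary signed digit in {-1, 0, 1}. A twit with the noncontiguous value set {-1, 1} is simply the kind `(-1, 1)`.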

16.

Fully redundant decimal addition and subtraction using stored-unibit encoding

Amir Kaivani 《Integration, the VLSI Journal》2010,43(1):34-41

Decimal computer arithmetic is experiencing a revived popularity, and there is a quest for high-performance decimal hardware units. Successful experiences on binary computer arithmetic may find grounds in decimal arithmetic. For example, the traditional fully redundant (i.e., the result and both of the operands are represented in a redundant format) and semi-redundant (i.e., the result and only one of the operands are redundant) binary addition schemes have influenced the design and implementation of similar decimal arithmetic units. However, special comparison and correction steps are required when decimal arithmetic algorithms are implemented on binary hardware. To circumvent these difficulties, alternative encodings of decimal digits and a variety of decimal arithmetic algorithms have been examined by many researchers over decades. In this paper we offer a new redundant decimal digit set [−8, 9] and a fully redundant addition/subtraction scheme. The proposed digit set, faithfully encoded as a mix of posibits, negabits, and unibits, is shown to obviate the need for any compare-to-9 operations and leads to minimal penalty subtraction using the addition circuitry. Moreover, conversion from the standard BCD encoding to the proposed stored-unibit encoding is possible with the latency of one logic level. However, the reverse conversion, like any other redundant to nonredundant conversion, involves carry propagation.

17.

Constant-time addition with hybrid-redundant numbers: Theory and implementations

Ghassem Jaberipur Behrooz Parhami 《Integration, the VLSI Journal》2008,41(1):49-64

Hybrid-redundant number representation has provided a flexible framework for digit-parallel addition in a manner that facilitates area-time tradeoffs for VLSI implementations via arbitrary spacing of redundant digit positions within an otherwise nonredundant representation. We revisit the hybrid redundancy scheme, pointing out limitations such as representational asymmetry, lack of representational closure in certain adder implementations, and difficulties in subtraction and carry acceleration. Given the intuitiveness of the hybrid redundancy concept and its potential for describing practically useful redundant number systems, we are motivated to extend it within the framework of weighted bit-set encodings to circumvent the aforementioned problems. The extension is based mainly on allowing negatively weighted bits (negabits), as well as standard posibits, to appear in nonredundant positions. Our extended hybrid redundancy scheme provides for arbitrary spacing of redundant positions in symmetric digit sets, without any degradation in arithmetic efficiency, while at the same time allowing low-latency subtraction by means of the same circuitry that is used for addition. Finally, we describe how inverted encoding of negabits leads to the exclusive use of unmodified standard full/half-adder, counter, and compressor cells, with no extra inverters, and to the direct applicability of conventional carry acceleration techniques in constant-time addition.

18.

A high-accuracy approximate adder with correct sign calculation

《Integration, the VLSI Journal》2019

Conventional precise adders incur long delay and large power consumption to obtain accurate results. Exploiting the error tolerance of some applications such as multimedia, image processing, and machine learning, a number of recent works proposed to design approximate adders that generate inaccurate results occasionally in exchange for reduction in delay and power consumption. However, most of the existing approximate adders have a large relative error. Besides, when applied to 2's complement signed addition, they sometimes generate a wrong sign bit. In this paper, we propose a novel approximate adder that exploits the generate signals for carry speculation. Furthermore, we introduce a low-overhead module to reduce the relative error and a sign correction module to fix the sign error. Compared to the conventional ripple carry adder and carry-lookahead adder, our adder with block size of 4 reduces power-delay product by 66% and 32%, respectively, for a 32-bit addition. Compared to the existing approximate adders, our adder significantly reduces the maximal relative error and ensures correct sign calculation with comparable area, delay, and power consumption. We further tested the performance of our adders with and without the sign error correction module in three real applications: mean filter, edge detection, and k-means clustering. The experimental results demonstrated the importance of reducing the relative error and ensuring the correct sign calculation for 2's complement signed additions. The outputs produced using our adder with the sign error correction module are very close to those produced using an accurate adder.

19.

High-speed complex-number multiplications based on redundant binary representation of partial products

Kyung-Wook Shin Heung-Woo Jeon 《International Journal of Electronics》2013,100(6):683-702

The complex-number multiplier is one of the key arithmetic components for the baseband signal processing of modern digital communication systems such as channel equalization, timing recovery, modulation and demodulation. This paper presents two algorithms suitable for a high-speed complex-number multiplier, which are based on redundant binary (RB) representation of partial products. The basic idea behind our approach is to convert a pair of binary partial products into an RB form so that the post-addition/subtraction, which is inevitable in the conventional methods based on binary multiplication, is eliminated. With the proposed algorithms, the complex-number multiplication is reduced to two RB multiplications, one for the real part and the other for the imaginary part. The RB multiplication is defined by an addition of RB partial products, and is performed in parallel without carry propagation from the least-significant digit to the most-significant digit. This work results not only in simplified arithmetic operations, but also in highly parallel and simple architecture when compared with conventional methods using binary multiplications. To demonstrate the algorithms, two test chips have been implemented using a 0.8µm CMOS technology.

20.

Towards an automated design flow for memristor based VLSI circuits

《Integration, the VLSI Journal》2020

As today's CMOS technology is gradually scaling down to its physical limits, emerging technologies such as carbon nanotubes, magnetic tunnel junctions, and memristors are under research as future alternatives. Among them, the memristor is a promising candidate to implement futuristic VLSI circuits. It provides great scalability, near-zero standby power consumption, etc. In order to design memristor based VLSI circuits and explore their potential, it is crucial to develop an automated design flow. However, such a design flow is still missing so far. This paper proposes an automated design flow, Mosys, by reusing parts of existing CMOS VLSI circuit design tools. Mosys provides a circuit design flow from a Verilog programming interface to performance estimation models. In addition, it employs a probabilistic power estimation model instead of one based on an exhaustive-searching method. In our experiments, it reduces the running time by up to over 3000 times with a marginal error (<1%), as compared to the state-of-the-art. To verify the whole Mosys flow, several integer arithmetic functional units (e.g., add, multiply) are described in Verilog and implemented. In addition, Mosys is compared with the state-of-the-art using the EPFL benchmark suite. The results show that Mosys significantly improves the area (6.29x) and delay (4.68x) on average.