1.

An 8.8-ns 54×54-bit multiplier with high speed redundant binary architecture

Makino H. Nakase Y. Suzuki H. Morinaka H. Shinohara H. Mashiko K. 《Solid-State Circuits, IEEE Journal of》1996,31(6):773-783

A high speed redundant binary (RB) architecture, which is optimized for the fast CMOS parallel multiplier, is developed. This architecture enables one to convert a pair of partial products in normal binary (NB) form to one RB number with no additional circuit. We improved the RB adder (RBA) circuit so that it can make a fast addition of the RB partial products. We also simplified the converter circuit that converts the final RB number into the corresponding NB number. The carry propagation path of the converter circuit is implemented with only multiplexer circuits. A 54×54-bit multiplier is designed with this architecture. It is fabricated in 0.5 μm CMOS with triple level metal technology. The active area size is 3.0×3.08 mm² and the number of transistors is 78,800. This is the smallest number for all 54×54-bit multipliers ever reported. Under the condition of 3.3 V supply voltage, the chip achieves 8.8 ns multiplication time. The power dissipation of 540 mW is estimated for the operating frequency of 100 MHz. These are, so far, the fastest speed and the lowest power for 54×54-bit multipliers with 0.5-μm CMOS.
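The abstract does not spell out how two NB partial products become one RB number. A minimal sketch of one commonly described approach, assuming the identity A + B = A − (NOT B) − 1 (mod 2^n), is given here; the function names `pair_to_rb`, `rb_value`, and `pair_sum` are hypothetical and this is not claimed to be the authors' circuit.

```python
def pair_to_rb(a, b, n):
    """Pair two n-bit unsigned values into one RB digit vector (LSB first).

    Assumption: digit i is a_i - (NOT b_i), so each digit is -1, 0, or 1
    and only a bitwise inversion of b is needed.
    """
    digits = []
    for i in range(n):
        a_bit = (a >> i) & 1
        not_b_bit = 1 - ((b >> i) & 1)
        digits.append(a_bit - not_b_bit)
    return digits


def rb_value(digits):
    """Integer value of an RB digit vector given LSB first."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def pair_sum(a, b, n):
    """Recover a + b from the RB number plus a constant correction.

    For unsigned n-bit inputs, NOT b = 2^n - 1 - b, so
    a + b = rb_value + 2^n - 1.
    """
    return rb_value(pair_to_rb(a, b, n)) + (1 << n) - 1


if __name__ == "__main__":
    print(pair_to_rb(13, 6, 4))   # RB digits, LSB first
    print(pair_sum(13, 6, 4))     # equals 13 + 6
```

In this sketch the constant correction term is the same for every pair, so it could be folded into a single correction word rather than handled per pair.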

2.

A Novel Redundant Binary Number to Natural Binary Number Converter

S. K. Sahoo Anu Gupta Abhijit R. Asati Chandra Shekhar 《Journal of Signal Processing Systems》2010,59(3):297-307

Redundant binary numbers appear to be appropriate for high-speed arithmetic operations, but the delay and hardware cost associated with the conversion from redundant binary (RB) to natural binary (NB) numbers remain a challenge. In the present investigation a simple approach has been adopted to achieve high speed with less hardware and lower power. A circuit-level approach has been adopted to implement the equivalent bit conversion algorithm (EBCA) (Kim et al. IEEE Journal of Solid State Circuits 36:1538-1544, 2001, 38:159-160, 2003) for RB to NB conversion. The circuit is designed by exploiting the predictable carry-out feature of the EBCA. The implementation shows a significant delay-power product and component complexity advantage for a 64-bit RB to NB conversion using a novel carry-look-ahead equivalent bit converter.

3.

A New Redundant Binary Booth Encoding for Fast $2^{n}$-Bit Multiplier Design

《IEEE transactions on circuits and systems. I, Regular papers》2009,56(6):1192-1201

The use of redundant binary (RB) arithmetic in the design of high-speed digital multipliers is beneficial due to its high modularity and carry-free addition. To reduce the number of partial products, a high-radix-modified Booth encoding algorithm is desired. However, its use is hampered by the complexity of generating the hard multiples and the overheads resulting from negative multiples and normal binary (NB) to RB number conversion. This paper proposes a new RB Booth encoding scheme to circumvent these problems. The idea is to polarize two adjacent Booth encoded digits to directly form an RB partial product to avoid the hard multiple of high-radix Booth encoding without incurring any correction vector. The proposed method leads to lower encoding and decoding complexity than the recently proposed RB Booth encoder. Synthesis results using Artisan TSMC 0.18-μm standard-cell library show that the RB multipliers designed with our proposed Booth encoding algorithm exhibit on average 14% higher speed and 17% less energy-delay product than the existing multiplication algorithms for a gamut of power-of-two word lengths from 8 to 64 b.
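As background for the Booth encoding mentioned above, here is a minimal sketch of conventional radix-4 modified Booth recoding, which halves the number of partial products. It is the textbook scheme, not the new RB Booth encoding proposed in the paper; the names `booth_radix4_digits` and `booth_multiply` are hypothetical.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's-complement multiplier (n even) into
    n/2 radix-4 digits in {-2, -1, 0, 1, 2}, least significant first.

    Digit j is y[2j-1] + y[2j] - 2*y[2j+1], with y[-1] = 0.
    """
    if n % 2:
        raise ValueError("n must be even for radix-4 recoding")
    bits = [(y >> i) & 1 for i in range(n)]  # works for negative y too
    digits = []
    prev = 0
    for i in range(0, n, 2):
        digits.append(prev + bits[i] - 2 * bits[i + 1])
        prev = bits[i + 1]
    return digits


def booth_multiply(x, y, n):
    """Multiply x by an n-bit signed y using one partial product per digit."""
    total = 0
    for j, d in enumerate(booth_radix4_digits(y, n)):
        # d * x needs only 0, x, 2x and their negations: no hard multiple.
        total += (d * x) << (2 * j)
    return total


if __name__ == "__main__":
    print(booth_radix4_digits(127, 8))
    print(booth_multiply(25, -7, 8))
```

Radix-4 needs only the multiples 0, ±x, and ±2x. A radix-8 recoding would also need ±3x, which is the kind of hard multiple the abstract refers to.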

4.

High-speed complex-number multiplications based on redundant binary representation of partial products

Kyung-Wook Shin Heung-Woo Jeon 《International Journal of Electronics》2013,100(6):683-702

The complex-number multiplier is one of the key arithmetic components for the baseband signal processing of modern digital communication systems such as channel equalization, timing recovery, modulation and demodulation. This paper presents two algorithms suitable for a high-speed complex-number multiplier, which are based on redundant binary (RB) representation of partial products. The basic idea behind our approach is to convert a pair of binary partial products into an RB form so that the post-addition/subtraction, which is inevitable in the conventional methods based on binary multiplication, is eliminated. With the proposed algorithms, the complex-number multiplication is reduced to two RB multiplications, one for the real part and the other for the imaginary part. The RB multiplication is defined by an addition of RB partial products, and is performed in parallel without carry propagation from the least-significant digit to the most-significant digit. This work results not only in simplified arithmetic operations, but also in a highly parallel and simple architecture when compared with conventional methods using binary multiplications. To demonstrate the algorithms, two test chips have been implemented using a 0.8 μm CMOS technology.

5.

A 10 ns 54×54 b parallel structured full array multiplier with 0.5 μm CMOS technology

Mori J. Nagamatsu M. Hirano M. Tanaka S. Noda M. Toyoshima Y. Hashimoto K. Hayashida H. Maeguchi K. 《Solid-State Circuits, IEEE Journal of》1991,26(4):600-606

A 54 b×54 b multiplier fabricated in a double-metal 0.5 μm CMOS technology is described. The 54 b×54 b full array is adopted to complete multiplication within one latency. A 10 ns multiplication time is achieved by optimizing both the propagation time of the part consisting of 4-2 compressors and the propagation time of the final adder part. The n-channel pass-transistor circuit and the p-channel load circuit are used at the critical blocks to improve the multiplication speed. This multiplier is intended to be applied to double-precision floating-point data processing based on the IEEE standard up to a clock range of 100 MHz.

6.

A remark on carry-free binary multiplication

Rulling W. 《Solid-State Circuits, IEEE Journal of》2003,38(1):159-160

It is shown that any binary multiplier needs some mechanism for carry propagation. As a consequence, the carry-free multiplier presented in the paper by Kim et al. (see IEEE J. Solid-State Circuits, vol. 36, p. 1538-1544, Oct. 2001) cannot work correctly. To demonstrate that fact, implementation-independent test patterns are constructed.

7.

Comments on "A carry-free 54 b×54 b multiplier using equivalent bit conversion algorithm"

Ercegovac M.D. Lang T. Kim Y. Song B.-S. Grosspietsch J. Gillig S.F. 《Solid-State Circuits, IEEE Journal of》2003,38(1):160-161

For original paper see ibid., vol. 36, no. 10, p. 1538-1545 (Oct. 2001). In the aforementioned paper by Kim et al., a multiplier is presented which produces the result in radix-2 signed-digit representation. It is claimed that this representation can be converted into conventional magnitude representation by an algorithm which has no carry propagation. To the commenters this algorithm seems incorrect. The critical situation is a string which consists of a sequence of zeros followed by a -1; in such a case a carry is needed and the algorithm proposed is deemed incorrect. Consequently, it is pointed out that the proposed algorithm produces a correct multiplication result in conventional magnitude representation only if the signed-digit string does not have a sequence of 0's followed by a -1. The commenters show a multiplication example using the proposed conversion algorithm in which this situation occurs.
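The critical case described above can be illustrated with a short sketch. It converts a radix-2 signed-digit string to an integer by subtracting its negative part from its positive part, and the digit strings used are illustrative examples chosen here, not the commenters' own test case. The names `sd_to_int` and `to_bits` are hypothetical.

```python
def sd_to_int(digits_msb_first):
    """Value of a radix-2 signed-digit string (digits in {-1, 0, 1}).

    The string is split into a positive word and a negative word, and the
    result is their difference. That subtraction is where a borrow (carry)
    may have to ripple through a run of zeros.
    """
    plus = 0
    minus = 0
    for d in digits_msb_first:
        plus = (plus << 1) | int(d == 1)
        minus = (minus << 1) | int(d == -1)
    return plus - minus


def to_bits(value, width):
    """Conventional binary string of a non-negative value."""
    return format(value, "0{}b".format(width))


if __name__ == "__main__":
    without_minus = [1, 0, 0, 0, 0]    # 16
    with_minus = [1, 0, 0, 0, -1]      # zeros followed by -1: 16 - 1 = 15
    print(to_bits(sd_to_int(without_minus), 5))
    print(to_bits(sd_to_int(with_minus), 5))
```

Changing only the least significant digit from 0 to -1 changes every bit of the conventional result, so the most significant output bit depends on the least significant input digit. That dependency is what a conversion with no carry propagation cannot capture.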

8.

A 54×54-b regularly structured tree multiplier

Goto G. Sato T. Nakajima M. Sukemura T. 《Solid-State Circuits, IEEE Journal of》1992,27(9):1229-1236

A 54-b×54-b parallel multiplier was implemented in 0.88-μm CMOS using the new, regularly structured tree (RST) design approach. The circuit is basically a Wallace tree, but the tree and the set of partial-product-bit generators are combined into a recurring block which generates seven partial-product bits and compresses them to a pair of bits for the sum and carry signals. This block is used repeatedly to construct an RST block in which even wiring among blocks included in wire shifters is designed as recurring units. By using recurring wire shifters, the authors can expand the level of repeated blocks to cover the entire adder tree, which simplifies the complicated Wallace tree wiring scheme. In addition to design time savings, layout density is increased by 70% to 6400 transistors/mm², and the multiplication time is decreased by 30% to 13 ns.

9.

A New Modular Exponentiation Architecture for Efficient Design of RSA Cryptosystem

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(9):1151-1161

Modular exponentiation with a large modulus, which is usually accomplished by repeated modular multiplications, has been widely used in public key cryptosystems for secured data communications. To speed up the computation, the Montgomery modular multiplication algorithm is used to relax the process of quotient determination, and the carry-save addition (CSA) is employed to reduce the critical path delay. In this paper, based on the inherent data dependency between the modular multiplication and square operations in the H-algorithm of modular exponentiation, we present a new modular exponentiation architecture with a unified modular multiplication/square module and show how to reduce the number of input operands for the CSA tree by mathematical manipulation. The developed architecture has the following advantages. 1) There is no need to convert the carry-save form of an operand into its binary representation at the end of each modular multiplication. In this way, except for the final step to get the result of modular exponentiation, the time-consuming carry propagation can be eliminated. 2) The number of input operands for the CSA tree is reduced in a very efficient way. 3) The hardware saving is achieved with very limited impact on the original critical path delay when designed with two distinct modular multiplication and square components. Experimental results show that our modular exponentiation design obtains the least hardware complexity compared with the existing work and outperforms them in terms of area-time (AT) complexity as well.
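For readers unfamiliar with the building blocks named above, here is a minimal software sketch of Montgomery modular multiplication inside a left-to-right square-and-multiply loop. It assumes the H-algorithm denotes the usual most-significant-bit-first scan of the exponent, and it works on plain integers rather than the carry-save operands of the paper. The names `mont_mul` and `mont_pow` are hypothetical.

```python
def mont_mul(a, b, n, k, n_prime):
    """Montgomery product a * b * R^-1 mod n, with R = 2^k > n and n odd.

    The quotient digit m is obtained from the low k bits only, which is
    how Montgomery's method avoids a trial division by n.
    """
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask
    u = (t + m * n) >> k          # exact division by R
    return u - n if u >= n else u


def mont_pow(base, exp, n):
    """base**exp mod n for odd n > 1, scanning exp from the MSB down."""
    if n % 2 == 0 or n < 3:
        raise ValueError("modulus must be odd and greater than 1")
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r     # requires Python 3.8+
    x = (base % n) * r % n             # operand in Montgomery form
    acc = r % n                        # 1 in Montgomery form
    for bit in bin(exp)[2:]:
        acc = mont_mul(acc, acc, n, k, n_prime)      # square
        if bit == "1":
            acc = mont_mul(acc, x, n, k, n_prime)    # multiply
    return mont_mul(acc, 1, n, k, n_prime)           # leave Montgomery form


if __name__ == "__main__":
    print(mont_pow(4, 13, 497), pow(4, 13, 497))
```

The square and the multiply in each iteration share the operand `acc`, which is the kind of data dependency between the two operations that the abstract says the unified module exploits.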

10.

A 4.4 ns CMOS 54×54-b multiplier using pass-transistor multiplexer

Ohkubo N. Suzuki M. Shinbo T. Yamanaka T. Shimizu A. Sasaki K. Nakagome Y. 《Solid-State Circuits, IEEE Journal of》1995,30(3):251-257

A 54×54-b multiplier using pass-transistor multiplexers has been fabricated by 0.25 μm CMOS technology. To enhance the speed performance, a new 4-2 compressor and a carry lookahead adder (CLA), both featuring pass-transistor multiplexers, have been developed. The new circuits have a speed advantage over conventional CMOS circuits because the number of critical-path gate stages is minimized due to the high logic functionality of pass-transistor multiplexers. The active size of the 54×54-b multiplier is 3.77×3.41 mm. The multiplication time is 4.4 ns at a 3.5-V power supply.

11.

Memristor based N-bits redundant binary adder

《Microelectronics Journal》2015,46(3):207-213

This paper introduces a memristor based N-bits redundant binary adder architecture for canonic signed digit code (CSDC) as a step towards a memristor based multilevel ALU. New possible solutions for multi-level logic designs can be established by utilizing the memristor dynamics as a basis in the circuit realization. The proposed memristor-based redundant binary adder circuit tries to achieve the theoretical advantages of the redundant binary system, and to eliminate the carry (borrow) propagation using signed digit representation. The advantage of carry elimination in the addition process is that it makes the speed independent of the operand length, which speeds up all arithmetic operations. One memristor is sufficient for both the addition process and for storing the final result as a memory cell. The adder operation has been validated via different cases for 1-bit and 3-bit addition using the HP memristor model and PSPICE simulation results.

12.

A New Variable-Radix Fast Multiplication Hardware Algorithm for Public-Key Cryptosystems

Gai Weixin (盖伟新) 《Acta Electronica Sinica (电子学报)》1995,23(11):77-80

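This paper proposes a new variable-radix fast multiplication hardware algorithm. The algorithm uses a redundant representation of binary numbers, so that the addition of two large numbers (512 bits or more) completes in O(1) time without waiting for carries. It also introduces the idea of variable-radix fast multiplication, which makes the algorithm 33% faster than radix-4 multiplication and 11% faster than radix-8 multiplication while being simpler to implement in hardware. The algorithm further overcomes the severe slowdown of radix-8 multiplication under bad and worst-case conditions. It is a new fast algorithm that can serve effectively as the core operation in hardware VLSI implementations of many public-key cryptosystems, such as RSA.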

13.

A 4.1-ns compact 54×54-b multiplier utilizing sign-select Booth encoders

Goto G. Inoue A. Ohe R. Kashiwakura S. Mitarai S. Tsuru T. Izawa T. 《Solid-State Circuits, IEEE Journal of》1997,32(11):1676-1682

A 54×54-b multiplier with only 60 K transistors has been fabricated by 0.25-μm CMOS technology. To reduce the total transistor count, we have developed two new approaches: sign-select Booth encoding and 48-transistor 4-2 compressor circuits both implemented with pass transistor logic. The sign-select Booth algorithm simplifies the Booth selector circuit and enables us to reduce the transistor count by 45% as compared with that of the conventional one. The new compressor reduces the count by 20% without speed degradation. By using these new circuits, the total transistor count of the multiplier is reduced by 24%. The active size of the 54×54-b multiplier is 1.04×1.27 mm and the multiplication time is 4.1 ns at a 2.5-V power supply.

14.

Improved memoryless RNS forward converter based on the periodicity of residues

Premkumar A.B. Ang E.L. Lai E.M.-K. 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2006,53(2):133-137

The residue number system (RNS) is suitable for DSP architectures because of its ability to perform fast carry-free arithmetic. However, this advantage is over-shadowed by the complexity involved in the conversion of numbers between binary and RNS representations. Although the reverse conversion (RNS to binary) is more complex, the forward transformation is not simple either. Most forward converters make use of look-up tables (memory). Recently, a memoryless forward converter architecture for arbitrary moduli sets was proposed by Premkumar in 2002. In this paper, we present an extension to that architecture which results in 44% less hardware for parallel conversion and achieves 43% improvement in speed for serial conversions. It makes use of the periodicity properties of residues obtained using modular exponentiation.
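The periodicity property mentioned above can be sketched in software as follows. For an odd modulus m, the residues of 2^i mod m repeat with some period p, so a binary-to-residue (forward) conversion only needs p distinct constants. This is an illustration of the general idea under the assumption of odd moduli, not the converter architecture of the paper; `power_residues` and `forward_convert` are hypothetical names.

```python
def power_residues(m):
    """One full period of 2^i mod m for an odd modulus m >= 1."""
    if m % 2 == 0 or m < 1:
        raise ValueError("this sketch assumes an odd positive modulus")
    residues = [1 % m]
    r = 2 % m
    while r != residues[0]:
        residues.append(r)
        r = (r * 2) % m
    return residues


def forward_convert(x, m):
    """Residue of a non-negative integer x modulo odd m.

    Each set bit i contributes the constant 2^(i mod p) mod m, so no
    table indexed by the full bit position is required.
    """
    residues = power_residues(m)
    p = len(residues)
    total = 0
    i = 0
    while x:
        if x & 1:
            total += residues[i % p]
        x >>= 1
        i += 1
    return total % m


if __name__ == "__main__":
    print(power_residues(7))          # period of powers of two modulo 7
    print([forward_convert(1000, m) for m in (7, 9, 11)])
```

The final `total % m` stands in for the small modular adder tree a hardware converter would use to combine the per-bit constants.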

15.

A Power-Delay Efficient Hybrid Carry-Lookahead/Carry-Select Based Redundant Binary to Two's Complement Converter

Yajuan He Chip-Hong Chang 《IEEE transactions on circuits and systems. I, Regular papers》2008,55(1):336-346

This paper presents an efficient reverse converter for transforming the redundant binary (RB) representation into two's complement form. The hierarchical expansion of the carry equation for the reverse conversion algorithm creates a regular multilevel structure, from which a high-speed hybrid carry-lookahead/carry-select (CLA/CSL) architecture is proposed to fully exploit the redundancy of RB encoding for VLSI efficient implementation. The optimally designed CSL sections are interleaved evenly in the mixed-radix CLA network to boost the performance of the reverse converter well above those designed based on a homogeneous type of carry propagation adder. The logical effort characterization captures the effect of circuit's fan-in, fan-out and transistor sizing on performance, and the evaluation shows that our proposed architecture leads to the fastest design. A 64-bit transistor-level circuit implementation of our proposed reverse converter and that of its most competitive contender were simulated to validate the logical effort delay model. The pre- and post-layout HSPICE simulation results reveal that our new converter expends at least two times less energy (power-delay product) than the competitor circuit and is capable of completing a 64-bit conversion in 829 ps and dissipates merely 5.84 mW at a data rate of 1 GHz and a supply voltage of 1.8 V in TSMC 0.18-μm CMOS technology.

16.

A new carry-free division algorithm and its application to a single-chip 1024-b RSA processor

Vandemeulebroecke A. Vanzieleghem E. Denayer T. Jespers P.G.A. 《Solid-State Circuits, IEEE Journal of》1990,25(3):748-756

A carry-free division algorithm is described. It is based on the properties of redundant signed digit (RSD) arithmetic to avoid carry propagation and uses the minimum hardware per bit, i.e. one full adder. Its application to a 1024-b RSA (Rivest, Shamir, and Adleman) cryptographic chip is presented. The features of this new algorithm allowed high performance (8 kb/s for 1024-b words) to be obtained for relatively small area and power consumption (80 mm² in a 2-μm CMOS process and 500 mW at 25 MHz).

17.

A fast VLSI adder architecture

Srinivas H.R. Parhi K.K. 《Solid-State Circuits, IEEE Journal of》1992,27(5):761-767

An architecture for performing fixed-point, high-speed, two's-complement, bit-parallel addition by using the carry-free property of redundant arithmetic and a fast parallel redundant-to-binary conversion scheme is presented. The internal numbers are represented in radix-2 redundant digit form, and the inputs and the output of the adder are represented in two's-complement binary form. The adder operands are added first in a radix-2 redundant adder to produce the result in radix-2 digit (-1, 0, 1) form. This result is converted to two's-complement binary form using the parallel conversion scheme. The high-speed conversion for long words is achieved through the use of a novel sign-select operation. The proposed adder, referred to as the sign-select conversion adder, is faster than all previous high-speed two's-complement binary adders for large word lengths. The implementation is highly regular with repeated modules and is very well suited for VLSI implementation.
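The carry-free property of radix-2 redundant addition can be sketched with the classic two-step signed-digit rule, in which each result digit depends only on the operand digits at its own position and the two positions below it. This is a textbook formulation, not the sign-select circuit of the paper; `rb_add` and `rb_to_int` are hypothetical names.

```python
def rb_add(x, y):
    """Add two equal-length signed-digit vectors (digits in {-1, 0, 1},
    LSB first) and return a vector one digit longer.

    Step 1 picks a transfer t[i+1] and interim sum w[i] with
    x[i] + y[i] = 2*t[i+1] + w[i], looking at position i-1 only to
    decide the sign of w[i]. Step 2 adds w[i] + t[i], which can no
    longer overflow the digit set, so no carry chain forms.
    """
    n = len(x)
    p = [a + b for a, b in zip(x, y)]
    w = [0] * n
    t = [0] * (n + 1)
    for i in range(n):
        lower_nonneg = i == 0 or p[i - 1] >= 0   # incoming transfer is >= 0
        if p[i] == 2:
            t[i + 1], w[i] = 1, 0
        elif p[i] == 1:
            t[i + 1], w[i] = (1, -1) if lower_nonneg else (0, 1)
        elif p[i] == -1:
            t[i + 1], w[i] = (0, -1) if lower_nonneg else (-1, 1)
        elif p[i] == -2:
            t[i + 1], w[i] = -1, 0
    return [w[i] + t[i] for i in range(n)] + [t[n]]


def rb_to_int(digits):
    """Conversion to an ordinary integer: positive part minus negative part."""
    plus = sum(1 << i for i, d in enumerate(digits) if d == 1)
    minus = sum(1 << i for i, d in enumerate(digits) if d == -1)
    return plus - minus


if __name__ == "__main__":
    s = rb_add([1, 1, 0, -1], [1, -1, 1, 1])
    print(s, rb_to_int(s))
```

The loop body for position i reads nothing above position i and nothing below position i-1, so in hardware all positions could be evaluated in parallel. Only the final `rb_to_int` step involves a carry-propagating subtraction, which is the conversion the abstract's parallel scheme targets.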

18.

Design and Implementation of an RSA Modular Exponentiation Unit

Liu Qiang (刘强) Tong Dong (佟冬) Cheng Xu (程旭) 《Acta Electronica Sinica (电子学报)》2005,33(5):923-927

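The rapid development of communication technology calls for higher-performance cryptographic processing equipment. The RSA modular exponentiation unit presented in this paper uses the Montgomery modular multiplication algorithm and the right-to-left binary method for the exponent, both adapted to the requirements of large-integer modular multiplication and VLSI implementation, to provide high-speed RSA modular exponentiation. The modular multiplier of this RSA unit uses a carry-save adder structure to avoid long carry chains. We propose a method that keeps multiple copies of a signal to solve the loading problem caused by broadcasting critical signals in large-integer arithmetic structures.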
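A minimal sketch of carry-save addition, the standard technique for avoiding long carry chains when many large operands are accumulated. It is an integer-level model, not the circuit of the paper, and the names `carry_save_add` and `csa_accumulate` are hypothetical.

```python
def carry_save_add(a, b, c):
    """Reduce three non-negative integers to a (sum, carry) pair.

    Every bit position is an independent full adder: the sum word is the
    bitwise XOR and the carry word is the bitwise majority shifted left.
    No carry travels between positions.
    """
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def csa_accumulate(operands):
    """Sum many operands while keeping the running total in carry-save form."""
    s, c = 0, 0
    for x in operands:
        s, c = carry_save_add(s, c, x)
    return s + c   # the only carry-propagating addition, done once at the end


if __name__ == "__main__":
    print(carry_save_add(5, 3, 6))
    print(csa_accumulate([2 ** 512 - 1, 12345, 2 ** 300 + 7]))
```

Because the intermediate result stays as two words, the delay of each accumulation step is that of a single full adder regardless of operand width.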

19.

A 600-MHz 54×54-bit multiplier with rectangular-styled Wallace tree

Itoh N. Naemura Y. Makino H. Nakase Y. Yoshihara T. Horiba Y. 《Solid-State Circuits, IEEE Journal of》2001,36(2):249-257

This paper presents an efficient layout method for a high-speed multiplier. The Wallace-tree method is generally used for high-speed multipliers. In the conventional Wallace tree, however, every partial product is added in a single direction from top to bottom. Therefore, the number of adders increases as the adding stage moves forward. As a result, it generates a dead area when the multiplier is laid out in a rectangle. To solve this problem, we propose a rectangular Wallace-tree construction method. In our method, the partial products are divided into two groups and added in opposite directions. The partial products in the first group are added downward, and the partial products in the second group are added upward. Using this method, we eliminate the dead area. Also, we optimized the carry propagation between the two groups to realize high speed and a simple layout. We applied it to a 54×54-bit multiplier. The 980 μm×1000 μm area size and the 600 MHz clock speed have been achieved using 0.18 μm CMOS technology.

20.

A 200-MHz complex number multiplier using redundant binary arithmetic

Kyung-Wook Shin Bang-Sup Song Bacrania K. 《Solid-State Circuits, IEEE Journal of》1998,33(6):904-909

Modern digital communication systems rely heavily on baseband signal processing for in-phase and quadrature (I-Q) channels, and complex number processing in low-voltage CMOS has become a necessity for channel equalization, timing recovery, modulation, and demodulation. In this work, redundant binary (RB) arithmetic is applied to complex number multiplication for the first time so that an N-bit parallel complex number multiplier can be reduced to two RB multiplications (i.e., an addition of N RB partial products) corresponding to real and imaginary parts, respectively. The efficient RB encoding scheme proposed can generate RB partial products with no additional hardware and delay overheads. A prototype 8-bit complex number multiplier containing 11.5 K transistors is integrated on 1.05×1.33 mm² using 0.8 μm CMOS. The chip consumes 90 mW with 2.5 V supply when clocked at 200 MHz.