1.

A fully redundant decimal adder and its application in parallel decimal multipliers

Saeid Gorgin 《Microelectronics Journal》2009,40(10):1471-1481

Decimal hardware arithmetic units have recently regained popularity, as there is now a high demand for high-performance decimal arithmetic. We propose a novel method for carry-free addition of decimal numbers, where each equally weighted decimal digit pair of the two operands is partitioned into two weighted bit-sets. The arithmetic values of these bit-sets are evaluated, in parallel, for fast computation of the transfer digit and interim sum. In the proposed fully redundant adder (vs. semi-redundant ones such as decimal carry-save adders), both operands and the sum are redundant decimal numbers with the overloaded decimal digit set [0, 15]. This adder is shown to improve upon the latest comparable high-performance designs and to outperform all previous adders of its kind. A drawback, however, is that the adder logic cannot be efficiently adapted for subtraction. Nevertheless, this adder and its restricted-input varieties are shown to fit efficiently in the design of a parallel decimal multiplier. The two-to-one partial product reduction ratio attained via the proposed adder has led to a VLSI-friendly recursive partial product reduction tree. Two alternative architectures for decimal multipliers are presented; one is slower but area-improved, and the other consumes more area but is delay-improved. Both, however, are faster than previously reported parallel decimal multipliers. The area and latency comparisons are based on logical effort analysis under the same assumptions for all the evaluated adders and multipliers. Moreover, the correctness of all the adders is checked by running exhaustive tests on the corresponding VHDL code. For a more reliable evaluation, we report the results of synthesizing these adders with Synopsys Design Compiler using the TSMC 0.13 μm standard CMOS process under various timing constraints.
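The carry-free idea in this abstract can be illustrated with a small behavioral sketch. It is not the paper's bit-set partitioning: the function names and the particular choice of transfer digit (`min(p // 10, 2)`) are assumptions made only to show that a transfer digit and an interim sum can be chosen per position so that no carry ripples further than one digit while every digit stays in [0, 15].

```python
def odds_add(x, y):
    """Add two redundant decimal numbers given as digit lists, least significant first.

    Every input digit lies in the overloaded decimal digit set [0, 15].
    Returns a digit list one position longer whose digits also lie in [0, 15].
    Illustrative model only; not the circuit described in the paper.
    """
    n = max(len(x), len(y), 1)
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))

    interim = []
    transfer = []
    for xd, yd in zip(x, y):
        p = xd + yd                 # position sum, in [0, 30]
        t = min(p // 10, 2)         # transfer digit, in {0, 1, 2}
        interim.append(p - 10 * t)  # interim sum, in [0, 10]
        transfer.append(t)

    # Each result digit is its interim sum plus the transfer from the position
    # to its right, so it depends on two positions only (no carry chain).
    incoming = [0] + transfer[:-1]
    result = [w + t_in for w, t_in in zip(interim, incoming)]
    result.append(transfer[-1])
    return result


def odds_value(digits):
    """Integer value of a digit list (least significant first)."""
    return sum(d * 10 ** i for i, d in enumerate(digits))
```

Because the interim sum never exceeds 10 and the incoming transfer never exceeds 2, each result digit is at most 12, so the sum stays inside the same digit set as the operands.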

2.

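Design of a High-Performance Parallel Fully Redundant Decimal Multiplier

Zhang Liu, Cui Xiaoping, Dong Wenwen 《电子学报》(Acta Electronica Sinica) 2018,46(6):1519-1523

The demand for high-precision computation in fields such as commercial computing and financial analysis places ever higher requirements on hardware decimal arithmetic. In existing fully redundant decimal multipliers, the structural complexity of the fully redundant adder has become a bottleneck to performance improvement. This paper presents an optimized fully redundant adder based on the overloaded decimal digit set (ODDS) to reduce that complexity, and designs a new decimal compression tree module built on this adder. In the partial product generation module, signed radix-10 encoding and redundant binary coded decimal (BCD) encoding are used to generate decimal partial products quickly. In the final product generation module, an optimized code conversion circuit quickly produces the BCD-8421 product. Experimental results show that the proposed parallel fully redundant decimal multiplier is relatively fast and relatively small in area.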

3.

A fast VLSI adder architecture

Srinivas H.R. Parhi K.K. 《Solid-State Circuits, IEEE Journal of》1992,27(5):761-767

An architecture for performing fixed-point, high-speed, two's-complement, bit-parallel addition by using the carry-free property of redundant arithmetic and a fast parallel redundant-to-binary conversion scheme is presented. The internal numbers are represented in radix-2 redundant digit form, and the inputs and the output of the adder are represented in two's-complement binary form. The adder operands are added first in a radix-2 redundant adder to produce the result in radix-2 digit (-1, 0, 1) form. This result is converted to two's-complement binary form using the parallel conversion scheme. The high-speed conversion for long words is achieved through the use of a novel sign-select operation. The proposed adder, referred to as the sign-select conversion adder, is faster than all previous high-speed two's-complement binary adders for large word lengths. The implementation is highly regular with repeated modules and is very well suited for VLSI implementation.
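The two stages named in this abstract (carry-free addition in radix-2 signed-digit form, then conversion to two's complement) can be modeled as follows. This is a sketch under stated assumptions: the addition uses the classic textbook rule that inspects the neighboring lower position, and the conversion simply subtracts the negative digits from the positive ones. It does not reproduce the paper's sign-select conversion, and all function names are hypothetical.

```python
def sd_add(x, y):
    """Carry-free addition of radix-2 signed-digit numbers.

    x and y are digit lists (least significant first) with digits in {-1, 0, 1}.
    Each output digit depends only on positions i, i-1 and i-2.
    """
    n = max(len(x), len(y), 1)
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))
    interim, transfer = [], []
    for i in range(n):
        p = x[i] + y[i]
        # True when the position to the right cannot send a negative transfer.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if p == 2:
            t, w = 1, 0
        elif p == -2:
            t, w = -1, 0
        elif p == 1:
            t, w = (1, -1) if lower_nonneg else (0, 1)
        elif p == -1:
            t, w = (0, -1) if lower_nonneg else (-1, 1)
        else:
            t, w = 0, 0
        interim.append(w)
        transfer.append(t)
    digits = [w + t for w, t in zip(interim, [0] + transfer[:-1])]
    digits.append(transfer[-1])
    return digits


def sd_to_int(digits):
    """Value of a signed-digit list: positive part minus negative part."""
    pos = sum(1 << i for i, d in enumerate(digits) if d == 1)
    neg = sum(1 << i for i, d in enumerate(digits) if d == -1)
    return pos - neg


def to_twos_complement(digits, width):
    """Two's-complement bit pattern of the value, truncated to `width` bits."""
    return sd_to_int(digits) & ((1 << width) - 1)
```

The subtraction inside `sd_to_int` is where a real design pays for carry propagation, which is the step the paper's parallel conversion scheme targets.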

4.

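Optical Parallel Negabinary Arithmetic Based on Logic Operations and Signed-Digit Representation

Li Guoqiang, Liu Liren, Qian Jiajun, Yin Yaozu 《中国激光》(Chinese Journal of Lasers) 1997,24(7):659-664

Optical parallel negabinary arithmetic is studied. Based on the signed-digit negabinary representation, parallel two-step addition and one-step subtraction for operands of arbitrary word length are proposed. These basic operations can all be realized through optical logic with spatial encoding and decoding, thereby providing an effective design scheme for an optical arithmetic-logic unit (ALU).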
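For readers unfamiliar with negabinary (radix -2), the short sketch below converts integers to and from plain negabinary digit lists. It only illustrates the number system; it uses digits {0, 1} rather than the signed-digit form the paper builds on, and the function names are hypothetical.

```python
def to_negabinary(n):
    """Digits of n in radix -2, least significant first, each digit in {0, 1}."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        n, r = divmod(n, -2)  # with a negative divisor Python yields r in {-1, 0}
        if r < 0:             # turn a remainder of -1 into the digit 1
            n += 1
            r += 2
        digits.append(r)
    return digits


def from_negabinary(digits):
    """Integer value of a radix -2 digit list (least significant first)."""
    return sum(d * (-2) ** i for i, d in enumerate(digits))
```

Both positive and negative integers are representable without a separate sign, which is one reason the system is attractive for parallel arithmetic.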

5.

Constant-time addition with hybrid-redundant numbers: Theory and implementations

Ghassem Jaberipur Behrooz Parhami 《Integration, the VLSI Journal》2008,41(1):49-64

Hybrid-redundant number representation has provided a flexible framework for digit-parallel addition in a manner that facilitates area-time tradeoffs for VLSI implementations via arbitrary spacing of redundant digit positions within an otherwise nonredundant representation. We revisit the hybrid redundancy scheme, pointing out limitations such as representational asymmetry, lack of representational closure in certain adder implementations, and difficulties in subtraction and carry acceleration. Given the intuitiveness of the hybrid redundancy concept and its potential for describing practically useful redundant number systems, we are motivated to extend it within the framework of weighted bit-set encodings to circumvent the aforementioned problems. The extension is based mainly on allowing negatively weighted bits (negabits), as well as standard posibits, to appear in nonredundant positions. Our extended hybrid redundancy scheme provides for arbitrary spacing of redundant positions in symmetric digit sets, without any degradation in arithmetic efficiency, while at the same time allowing low-latency subtraction by means of the same circuitry that is used for addition. Finally, we describe how inverted encoding of negabits leads to the exclusive use of unmodified standard full/half-adder, counter, and compressor cells, with no extra inverters, and to the direct applicability of conventional carry acceleration techniques in constant-time addition.
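The closing claim about inverted encoding can be checked with a small model. Assume a negabit with value v in {-1, 0} is stored as the bit v + 1, and a posibit is stored as its own value. Under that assumption, an ordinary full adder applied to the stored bits of any mix of three posibits and negabits yields a correct carry and sum, provided the outputs are reinterpreted by counting the negabit inputs. The helper names below are hypothetical.

```python
def full_adder(a, b, c):
    """Unmodified binary full adder on three stored bits; returns (carry, sum)."""
    total = a + b + c
    return total >> 1, total & 1


def bit_value(stored, is_negabit):
    """Arithmetic value of a stored bit: negabits use inverted encoding (0 means -1)."""
    return stored - 1 if is_negabit else stored


def reduce_three(stored_bits, negabit_flags):
    """Reduce three equally weighted posibits/negabits with one full adder.

    Returns ((carry_bit, carry_is_negabit), (sum_bit, sum_is_negabit)).
    With k negabit inputs, the carry is a negabit when k >= 2 and the sum is a
    negabit when k is odd; no inverters are needed on inputs or outputs.
    """
    k = sum(1 for f in negabit_flags if f)
    carry, s = full_adder(*stored_bits)
    return (carry, k >= 2), (s, k % 2 == 1)
```

The rule follows from a + b + c = 2·carry + sum on the stored bits: subtracting k from both sides and splitting k as 2·[k ≥ 2] + [k odd] assigns the offsets to the carry and sum outputs.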

6.

Weighted two-valued digit-set encodings: unifying efficient hardware representation schemes for redundant number systems

Jaberipur G. Parhami B. Ghodsi M. 《IEEE transactions on circuits and systems. I, Regular papers》2005,52(7):1348-1357

We introduce the notion of two-valued digit (twit) as a binary variable that can assume one of two different integer values. Posibits, or simply bits, in {0,1} and negabits in {-1,0}, commonly used in two's-complement representations and (n,p) encoding of binary signed digits, are special cases of twits. A weighted bit-set (WBS) encoding, which generalizes the two's-complement encoding by allowing one or more posibits and/or negabits in each radix-2 position, has been shown to unify many efficient implementations of redundant number systems. A collection of equally weighted twits, including ones with noncontiguous values (e.g., {-1,1} or {0,2}), can lead to wider representation range without the added storage and interconnection costs associated with multivalued digit sets. We present weighted twit-set (WTS) encodings as a generalization of WBS encodings, examine key properties of this new class of encodings, and show that any redundant number system (e.g., generalized signed-digit and hybrid-redundant systems), including those that are based on noncontiguous and/or zero-excluded digit sets, is faithfully representable by WTS encoding. We highlight this broad coverage by a tree chart having WTS representations at its root and various useful redundant representations at its many internal nodes and leaves. We further examine how highly optimized conventional components such as standard full/half-adders and compressors may be used for arithmetic on WTS-encoded operands, thus allowing highly efficient and VLSI-friendly circuit implementations. For example, focusing on the WBS-like subclass of WTS encodings, we describe a twit-based implementation of a particular stored-transfer representation which offers area and speed advantages over other similar designs based on WBS and hybrid-redundant representations.
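A twit can be modeled as a stored bit plus the two integer values it may stand for. The sketch below is only a data-structure illustration of that definition; the class and helper names, and the choice that a stored 0 selects the lower value, are assumptions rather than anything fixed by the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Twit:
    low: int   # value represented when the stored bit is 0 (assumed convention)
    high: int  # value represented when the stored bit is 1
    bit: int   # the stored binary variable

    def value(self):
        return self.high if self.bit else self.low


def posibit(bit):
    """Ordinary bit with values {0, 1}."""
    return Twit(0, 1, bit)


def negabit(bit):
    """Negatively weighted bit with values {-1, 0}."""
    return Twit(-1, 0, bit)


def wts_value(positions):
    """Value of a weighted twit-set number.

    positions[i] is the list of twits that share the radix-2 weight 2**i.
    """
    return sum((1 << i) * sum(t.value() for t in twits)
               for i, twits in enumerate(positions))
```

For example, one posibit and one negabit in the same position cover the digit set {-1, 0, 1}, while a single twit with values {-1, 1} covers a noncontiguous set using one stored bit.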

7.

Memristor based N-bits redundant binary adder

《Microelectronics Journal》2015,46(3):207-213

This paper introduces a memristor-based N-bit redundant binary adder architecture for canonic signed digit code (CSDC) as a step towards a memristor-based multilevel ALU. New possible solutions for multi-level logic designs can be established by utilizing the memristor dynamics as a basis in the circuit realization. The proposed memristor-based redundant binary adder circuit aims to achieve the theoretical advantages of the redundant binary system and to eliminate carry (borrow) propagation using signed digit representation. The advantage of carry elimination in the addition process is that it makes the speed independent of the operand length, which speeds up all arithmetic operations. One memristor is sufficient both for the addition process and for storing the final result as a memory cell. The adder operation has been validated on different cases of 1-bit and 3-bit addition using the HP memristor model and PSPICE simulation results.

8.

Decimal Division Algorithms: The Issue of Partial Remainders

Amir Kaivani Seok-Bum Ko 《Journal of Signal Processing Systems》2013,73(2):181-188

The efficiency of decimal digit-recurrence division algorithms is strongly affected by the number representations of the quotient, the divisor, and the partial remainders that participate in quotient digit selection (QDS). This paper establishes general rules and conditions for QDS with operands represented in the generalized signed-digit format. As a result of this generalization, a universal convergence condition is introduced which obviates the unnecessary conservatism of previous algorithms and hence paves the way for more correct and efficient decimal division hardware designs. It is also concluded that keeping the partial remainders in minimally redundant symmetric signed-digit representation (with digit-set [-5,6]) and applying to QDS the divisor represented in minimally asymmetric non-redundant signed-digit format (with digit-set [-4,5]) lead to the smallest minimum precision required, of the divisor and the partial remainder, for QDS and thus to a faster and simpler division algorithm. Moreover, it is shown that even when non-redundant partial remainders are used (for the sake of lower area cost), the minimally asymmetric signed-digit representation brings about more efficiency. The suggested representations are applied to the fastest previous decimal digit-recurrence divider, and a 10% speed-up is achieved while keeping the area cost approximately unaltered.

9.

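Design of a Decimal Adder Based on a Parallel Prefix Structure

Wang Shumin, Cui Xiaoping 《电子科技》(Electronic Science and Technology) 2016,29(6):19

To address the need to handle invalid codes when implementing BCD decimal addition in hardware, a decimal adder based on a parallel prefix structure is designed. The adder follows an algorithm that adds 6 in advance, computes an intermediate sum by binary addition, and then applies a subtract-6 correction. The subtract-6 correction step is integrated into a redesigned subtract-6-correction carry-select adder, and the parallel prefix structure is fully exploited to greatly increase the parallelism of the circuit. The adder is implemented in Verilog HDL and synthesized with Design Compiler; the resulting 32-bit, 64-bit, and 128-bit decimal adders have delays of 0.56 ns, 0.61 ns, and 0.71 ns and areas of 1 310 μm², 2 681 μm², and 5 485 μm², respectively.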
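The add-6, binary-add, subtract-6 sequence described in this abstract can be modeled digit by digit. The sketch below is a behavioral, ripple-style model with hypothetical function names; it does not model the parallel prefix carry network or the carry-select correction stage that the paper contributes.

```python
def to_digits(n, width):
    """Decimal digits of n, least significant first, padded to `width` digits."""
    return [(n // 10 ** i) % 10 for i in range(width)]


def from_digits(digits):
    """Integer value of a decimal digit list (least significant first)."""
    return sum(d * 10 ** i for i, d in enumerate(digits))


def bcd_add(a_digits, b_digits):
    """Add two BCD numbers of equal length; returns (sum_digits, carry_out)."""
    result = []
    carry = 0
    for a, b in zip(a_digits, b_digits):
        t = (a + 6) + b + carry    # pre-biased 4-bit binary addition
        if t >= 16:                # binary carry out of the nibble equals the decimal carry
            carry = 1
            result.append(t - 16)  # keep the low 4 bits; the bias was absorbed by the carry
        else:
            carry = 0
            result.append(t - 6)   # no decimal carry: remove the bias again
    return result, carry
```

The bias of 6 makes the 4-bit binary carry coincide with the decimal carry, so the invalid codes 10 to 15 never have to be detected explicitly.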

10.

Redundant binary partial product generators for compact accumulation in Booth multipliers

Bijoy A. Jose 《Microelectronics Journal》2009,40(11):1606-1612

The use of signed-digit number systems in arithmetic circuits has the advantage of constant-time addition. When signed-digit number systems are used in binary, they are referred to as redundant binary. Here, we present a new encoding technique for generating redundant binary partial products for a multiplier, without using any additional hardware. We express each normal binary partial product in one's complement form, with an extra bit denoting the sign bit. The proposed redundant binary partial product generator (RBPPG) achieves the highest reduction in the number of partial products (75%) for a radix-4 multiplier. The carry-free nature of redundant binary adders is exploited to add the extra bits with the partial products, without using any extra adder stages. The new partial product generation (PPG) technique is shown to improve the speed of multipliers, with the least number of adder stages, irrespective of the multiplier size.

11.

Further Reducing the Redundancy of a Notation Over a Minimally Redundant Digit Set

Marc Daumas David W. Matula 《The Journal of VLSI Signal Processing》2003,33(1-2):7-18

Redundant notations are used implicitly or explicitly in many digital designs. They have been studied in detail, and a general framework is known for reducing the redundancy of a notation down to the minimally redundant digit set. We present here an operator to further reduce the redundancy of such a representation. It does not reduce the number of allowed digits, since removing one digit from a minimally redundant digit set is a conversion to a non-redundant digit set, which is an expensive operation. Our operator introduces some correlation between the digits to reduce the number of possible redundant notations for any represented number. This reduction is visible in small useful operators like the elimination of leading zeros. We also present a key application with a CMOS Booth recoded multiplier. Our multiplier is able to accept either a redundant or a non-redundant input with very little modification and almost no penalty in time or space compared to state-of-the-art non-redundant multipliers.

12.

An Efficient Universal Addition Scheme for All Hybrid-Redundant Representations with Weighted Bit-Set Encoding

Ghassem Jaberipur Behrooz Parhami Mohammad Ghodsi 《The Journal of VLSI Signal Processing》2006,42(2):149-158

Redundant and hybrid-redundant number representations are used extensively to speed up arithmetic operations within general-purpose and special-purpose digital systems, with the latter (containing both redundant and nonredundant digits) offering cost advantages over fully redundant systems. We use weighted bit-set (WBS) encoding as a paradigm for uniform treatment of five previously studied variants of hybrid-redundant systems. We then extend the class of hybrid-redundant numbers to coincide with the entire set of canonical WBS numbers by allowing an arbitrary nonredundant position, heretofore restricted to ordinary bits (posibits), to hold a negatively weighted bit (negabit). This flexibility leads to interesting and useful symmetric variants of hybrid-redundant representations. We provide a high-level circuit design, based solely on binary full-adders, for a constant-time universal hybrid-redundant adder capable of producing a canonical WBS-encoded sum of two canonical WBS (or extended hybrid) numbers. This is made possible by the use of conventional binary full-adders for reducing any collection of three posibits and negabits, where negabits use an inverted encoding. We compare our adder to previous designs, showing advantages in speed, cost, and regularity. Furthermore we explore representationally closed addition schemes, holding the benefit of greater regularity and reusability, and provide high-level representationally closed designs for all the previously studied variants of hybrid redundancy and for the new symmetric variants introduced here. Finally, we present a new functionality for a conventional (4; 2) compressor in combining any collection of five equally weighted negabits and posibits, and show its utility in the design of multipliers for extended hybrid-redundant numbers. 
Ghassem Jaberipur received BS in electrical engineering and PhD in computer engineering from Sharif University of Technology in 1974 and 2004, respectively, MS in engineering (majoring in computer hardware) from University of California, Los Angeles, in 1976, and MS in computer science from University of Wisconsin, Madison, in 1979. Since 1979, he has been with the Department of Electrical and Computer Engineering, Shahid Beheshti University, in Tehran, Iran, teaching courses in compiler construction, automata theory, design and implementation of programming languages, and computer arithmetic. Behrooz Parhami (PhD, University of California, Los Angeles, 1973) is Professor of Electrical and Computer Engineering at University of California, Santa Barbara. He has research interests in computer arithmetic, parallel processing, and dependable computing. In his previous position with Sharif University of Technology in Tehran, Iran (1974--88), he was also involved in educational planning, curriculum development, standardization efforts, technology transfer, and various editorial responsibilities, including a five-year term as Editor of Computer Report, a Persian-language computing periodical. His technical publications include over 200 papers in peer-reviewed journals and international conferences, a Persian-language textbook, and an English/Persian glossary of computing terms. Among his publications are three textbooks on parallel processing (Plenum, 1999), computer arithmetic (Oxford, 2000), and computer architecture (Oxford, 2005). Dr. Parhami is a Fellow of both the IEEE and the British Computer Society, a member of the Association for Computing Machinery, and a Distinguished Member of the Informatics Society of Iran for which he served as a founding member and President during 1979-84. He also served as Chairman of IEEE Iran Section (1977-86) and received the IEEE Centennial Medal in 1984. 
Mohammad Ghodsi received BS in electrical engineering from Sharif University of Technology (SUT, Tehran, Iran) in 1975, MS in electrical engineering and computer science from University of California at Berkeley in 1978, and PhD in computer science from the Pennsylvania State University in 1989. He has been affiliated with SUT as a faculty member since 1979. Presently, he is a Professor in SUT's Computer Engineering Department. His research interests include design of efficient algorithms, parallel and systolic algorithms, and computational geometry.

13.

High-speed complex-number multiplications based on redundant binary representation of partial products

Kyung-Wook Shin Heung-Woo Jeon 《International Journal of Electronics》2013,100(6):683-702

The complex-number multiplier is one of the key arithmetic components for the baseband signal processing of modern digital communication systems, in tasks such as channel equalization, timing recovery, modulation and demodulation. This paper presents two algorithms suitable for a high-speed complex-number multiplier, which are based on redundant binary (RB) representation of partial products. The basic idea behind our approach is to convert a pair of binary partial products into an RB form so that the post-addition/subtraction, which is inevitable in the conventional methods based on binary multiplication, is eliminated. With the proposed algorithms, the complex-number multiplication is reduced to two RB multiplications, one for the real part and the other for the imaginary part. The RB multiplication is defined by an addition of RB partial products, and is performed in parallel without carry propagation from the least-significant digit to the most-significant digit. This work results not only in simplified arithmetic operations, but also in a highly parallel and simple architecture when compared with conventional methods using binary multiplications. To demonstrate the algorithms, two test chips have been implemented using a 0.8 µm CMOS technology.

14.

High-speed VLSI arithmetic processor architectures using hybrid number representation

H. R. Srinivas Keshab K. Parhi 《The Journal of VLSI Signal Processing》1992,4(2-3):177-198

This paper addresses the design of high-speed architectures for fixed-point, two's-complement, bit-parallel division, square-root, and multiplication operations. These architectures make use of hybrid number representations (i.e., the input and output numbers are represented using two's complement representation, and the internal numbers are represented using radix-2 redundant representation). We propose new shifted remainder conditioning and sign multiplexing techniques in combination with novel circuit architecture approaches to obtain efficient divider and square-root architectures. Our divider exploits the full dynamic range of operands and eliminates the need for on-line or off-line conversion of the result to binary (this is because our nonrestoring division and square-root operators output a binary quotient). Furthermore, since the binary input set is a subset of the redundant digit set, no binary-to-redundant number conversion is necessary at the input of the divider and square-root operators. We also present a fast, new conversion scheme for converting radix-2 redundant numbers to two's complement binary numbers, and use this to design a bit-parallel multiplier. This multiplier architecture requires fewer pipelining latches than conventional two's complement multipliers, and reduces the latency of the multiplication operation from (2W–1) to about W (where W is the word-length), when pipelined at the bit-level. This research was supported by the Office of Naval Research under contract number N00014-J-91-1008.

15.

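A Binary-Decimal Mutual Conversion Algorithm Based on Simple Shifting

Wang Yingchun, Ji Lijiu 《电子学报》(Acta Electronica Sinica) 2003,31(2):221-224

Research on conversion between decimal (BCD) codes and binary codes has focused mainly on software implementation. Based on the basic principles of radix conversion, this paper proposes a shift-based conversion algorithm suited to hardware implementation. Following this algorithm, a converter between 63-bit binary and decimal codes is constructed. The algorithm is also extended to a radix-2^r shift algorithm, which further improves performance. A performance comparison shows that the algorithm is fast and logically simple, making it well suited to embedded applications with strict real-time requirements.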
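As background for shift-based conversion, the sketch below implements the classic shift-and-add-3 ("double dabble") method and its reverse. This is the well-known textbook technique, offered as an assumption about the general family the abstract refers to; it is not claimed to be the paper's algorithm or its radix-2^r extension, and the function names are hypothetical.

```python
def binary_to_bcd(n, bits):
    """Convert a `bits`-bit unsigned integer to decimal digits (most significant first)."""
    ndigits = len(str((1 << bits) - 1))  # enough digits for any `bits`-bit value
    bcd = [0] * ndigits                  # bcd[0] is the least significant digit
    for i in range(bits - 1, -1, -1):
        # Add 3 to every digit >= 5 so the following doubling carries in decimal.
        bcd = [d + 3 if d >= 5 else d for d in bcd]
        # Shift the whole BCD register left by one bit, feeding in the next binary bit.
        carry = (n >> i) & 1
        for k in range(ndigits):
            d = (bcd[k] << 1) | carry
            carry = d >> 4
            bcd[k] = d & 0xF
    return bcd[::-1]


def bcd_to_binary(digits, bits):
    """Convert decimal digits (most significant first) to an integer of `bits` bits."""
    bcd = list(digits[::-1])             # least significant digit first
    n = 0
    for i in range(bits):
        # Shift the BCD register right by one bit; the bit that falls out is bit i.
        carry = 0
        for k in range(len(bcd) - 1, -1, -1):
            d = bcd[k] | (carry << 4)
            carry = d & 1
            bcd[k] = d >> 1
        n |= carry << i
        # A digit >= 8 received a bit worth 5 from its neighbor, so subtract 3.
        bcd = [d - 3 if d >= 8 else d for d in bcd]
    return n
```

Both directions use only shifts and small constant corrections per digit, which is why this family of methods maps well to hardware.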

16.

A New Redundant Binary Booth Encoding for Fast $2^{n}$-Bit Multiplier Design

《IEEE transactions on circuits and systems. I, Regular papers》2009,56(6):1192-1201

The use of redundant binary (RB) arithmetic in the design of high-speed digital multipliers is beneficial due to its high modularity and carry-free addition. To reduce the number of partial products, a high-radix-modified Booth encoding algorithm is desired. However, its use is hampered by the complexity of generating the hard multiples and the overheads resulting from negative multiples and normal binary (NB) to RB number conversion. This paper proposes a new RB Booth encoding scheme to circumvent these problems. The idea is to polarize two adjacent Booth encoded digits to directly form an RB partial product to avoid the hard multiple of high-radix Booth encoding without incurring any correction vector. The proposed method leads to lower encoding and decoding complexity than the recently proposed RB Booth encoder. Synthesis results using the Artisan TSMC 0.18-μm standard-cell library show that the RB multipliers designed with our proposed Booth encoding algorithm exhibit on average 14% higher speed and 17% less energy-delay product than the existing multiplication algorithms for a gamut of power-of-two word lengths from 8 to 64 b.

17.

B.C.D. serial adder/subtractor

Nicoud J.D. 《Electronics letters》1969,5(26):686-687

For the pure serial addition or subtraction of binary-coded decimal numbers, a simple network is proposed. It consists of a binary adder and a correction system using another adder. In an arithmetic unit, the position of this adder/subtractor at the beginning of the input/output register simplifies the design.

18.

A Novel Decimal Logarithmic Converter Based on First-Order Polynomial Approximation

Dongdong Chen Seok-Bum Ko 《Circuits, Systems, and Signal Processing》2012,31(3):1179-1190

This paper presents a decimal logarithmic converter based on the decimal first-order polynomial (linear) approximation algorithm. The proposed approach is mainly based on a look-up table, followed by a decimal linear approximation step. Compared with a binary-based decimal linear approximation algorithm (Algorithm 1), the proposed algorithm (Algorithm 2) is error-free in the conversion between the decimal and the binary formats. The proposed architecture is implemented in combinational logic using binary coded decimal (BCD) encoding on a Virtex5 XC5VLX110T FPGA. The comparison shows that Algorithm 2 runs 2.15 times faster than Algorithm 1, at the expense of 1.14 times more area.

19.

Digit pipelined arithmetic on fine-grain array processors

Chetana Nagendra Robert Michael Owens Mary Jane Irwin 《The Journal of VLSI Signal Processing》1995,9(3):193-209

In this paper, we present a novel scheme for performing fixed-point arithmetic efficiently on fine-grain, massively parallel, programmable architectures including both custom and FPGA-based systems. We achieve an O(n) speedup, where n is the operand precision, over the bit-serial methods of existing fine-grain systems such as the DAP, the MPP and the CM2, within the constraints of regular, near neighbor communication and only a small amount of on-chip memory. This is possible by means of digit pipelined algorithms which avoid broadcast and which operate in a fully systolic manner by pipelining at the digit level. A base 4, signed-digit, fully redundant number system and on-line techniques are used to limit carry propagation and minimize communication costs. Although our algorithms are digit-serial, we are able to match the performance of the bit-parallel methods, while retaining low communication complexity. Reconfigurable hardware systems built using field programmable gate arrays (FPGAs) can share in the speed benefits of these algorithms. By using the organization of logic blocks suggested in this paper, problems of placement and routing that exist in such systems can be avoided. Since the algorithms are amenable to pipelining, very high throughput can be obtained.

20.

An efficient maximum-redundancy radix-8 SRT division and square-root method

Hobson R.F. Fraser M.W. 《Solid-State Circuits, IEEE Journal of》1995,30(1):29-38

A new approach to integrating hardware multiplication, division, and square-root is presented. We use a fully integrated control path which simultaneously reduces part of the redundant partial-remainder and performs a truncated multiplication of the next quotient or square-root digit by the divisor or square-root value. A separate (parallel) full precision iterative multiplier is used for partial remainder production. Strategic details of a radix-8 implementation are discussed. It is shown that a maximally redundant digit set is a viable choice for high performance in this case.