1.

Low-complexity bit-parallel systolic multipliers over GF(2^m)

Chiou-Yng Lee 《Integration, the VLSI Journal》2008,41(1):106-112

This paper presents new time-dependent and time-independent multiplication algorithms over finite fields GF(2^m) by employing an interleaved conventional multiplication and a folded technique. The proposed algorithms allow efficient realization of bit-parallel systolic multipliers. The results show that the proposed time-independent multiplier saves about 54% space complexity compared to other related multipliers for polynomial and dual bases of GF(2^m). The proposed architectures feature regularity, modularity and local interconnection. Accordingly, they are well suited for VLSI implementation.

2.

An efficient reconfigurable multiplier architecture for Galois field GF(2^m)

P Kitsos G Theodoridis O Koufopavlou 《Microelectronics Journal》2003,34(10):975-980

This paper describes an efficient architecture of a reconfigurable bit-serial polynomial basis multiplier for Galois field GF(2^m), where 1<m≤M. The value m, the degree of the irreducible polynomial, can be changed, so the multiplier can be configured and programmed. The value of M determines the maximum field size that the multiplier can support. The advantages of the proposed architecture are (i) high flexibility, which allows easy configuration for different field sizes, and (ii) low hardware complexity, which results in small area. By using the gated clock technique, a significant reduction of the total multiplier power consumption is achieved.

3.

Novel Radix Finite Field Multiplier for GF(2^m)

Mekhallalati M.C. Ashur A.S. Ibrahim M.K. 《Journal of Signal Processing Systems》1997,15(3):233-245

In this paper, a new High-Radix Finite Field multiplication algorithm for GF(2^m) is proposed for the first time. The proposed multiplication algorithm can operate in a Digit-serial fashion, and hence can give a trade-off between the speed, the area, the input/output pin limitation, and the power consumption by simply varying the digit size. A detailed example of a new Radix-16 GF(2^m) Digit-Serial multiplication architecture adopting the proposed algorithm illustrates a speed improvement of 75% when compared to a conventional Radix-2 bit-serial realization. This is made more significant when it is noted that the speed improvement of 75% was achieved at the expense of only a 2.3 times increase in the hardware requirements of the proposed architecture.
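
The digit-size trade-off described in the abstract above can be illustrated with a small software model. The following Python sketch is not the paper's Radix-16 architecture; it is a generic MSD-first digit-serial multiplier in a polynomial basis. The helper names (`clmul`, `poly_mod`, `digit_serial_mul`) are hypothetical, and the field GF(2^8) with x^8 + x^4 + x^3 + x + 1 is chosen only as a convenient test case. Counting one loop iteration as one cycle is an assumption about how a figure such as 75% could arise, not something the abstract states.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2)[x]) product of two polynomials stored as bit masks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def poly_mod(v: int, m: int, poly: int) -> int:
    """Reduce v modulo poly, where poly has degree m (bit m is set)."""
    for i in range(v.bit_length() - 1, m - 1, -1):
        if (v >> i) & 1:
            v ^= poly << (i - m)
    return v


def digit_serial_mul(a: int, b: int, m: int, poly: int, d: int):
    """Multiply a*b in GF(2^m), consuming d bits of b per iteration (MSD first).

    Returns (product, iterations); iterations stands in for clock cycles.
    """
    acc, cycles = 0, 0
    digits = -(-m // d)          # ceil(m / d)
    mask = (1 << d) - 1
    for i in reversed(range(digits)):
        digit = (b >> (d * i)) & mask
        # Shift the accumulator by one digit, add the partial product, reduce.
        acc = poly_mod((acc << d) ^ clmul(a, digit), m, poly)
        cycles += 1
    return acc, cycles


AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, used here only as an example field
print(digit_serial_mul(0x57, 0x83, 8, AES_POLY, d=1))  # bit-serial: 8 iterations
print(digit_serial_mul(0x57, 0x83, 8, AES_POLY, d=4))  # radix-16: 2 iterations
```

With d = 4 the loop runs ceil(m/4) times instead of m times, which for m = 8 is 75% fewer iterations, while each iteration performs a wider multiply-and-reduce step.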

4.

An Efficient Look-up Table-based Approach for Multiplication over GF(2^m) Generated by Trinomials

Bimal K. Meher Pramod K. Meher 《Circuits, Systems, and Signal Processing》2013,32(6):2623-2638

In this paper, we present an efficient look-up table (LUT)-based approach to design multipliers for GF(2^m) generated by irreducible trinomials. A straightforward LUT-based multiplication requires a table of size (m×2^m) bits for the Galois field of degree m. The LUT size, therefore, becomes quite large for the fields of large degrees recommended by the National Institute of Standards and Technology (NIST). Keeping that in view, we have proposed a digit-serial LUT-based design, where operand bits are grouped into digits of fixed width, and multiplication is performed in a serial/parallel manner. We restrict the digit size to 4 so that only 16 words are stored in the LUT, which lowers the area-delay complexity. We have also proposed a digit-parallel LUT-based design for high-speed applications, using the same LUT as the digit-serial design, at the cost of some additional multiplexors and combinational logic for parallel modular reductions and additions. We have presented a simple circuit for the initialization of LUT content, which can be used to update the LUT in three cycles whenever required. The proposed digit-serial design involves less area-complexity and less time-complexity than the existing LUT-based designs. The proposed digit-parallel design offers nearly 28% improvement in area-delay product over the best of the existing LUT-based designs. NIST has recommended five binary finite fields for elliptic curve cryptography, out of which two are generated by the trinomials Q(x)=x²³³+x⁷⁴+1 and Q(x)=x⁴⁰⁹+x⁸⁷+1. In this paper, we have designed a reconfigurable multiplier that can be used for both these fields. The proposed reconfigurable multiplier is shown to have a negligible reconfiguration overhead and would be useful for cryptographic applications.
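
The idea of a 16-word LUT with 4-bit digits can be sketched in software. The Python below is an illustrative model only, not the paper's circuit: it assumes the table holds the 16 products d(x)·A(x) mod Q(x) for every 4-bit digit d, and that the second operand is consumed one digit at a time, most significant digit first. The function names `reduce_trinomial` and `lut_mul` are hypothetical.

```python
def reduce_trinomial(v: int, m: int, k: int) -> int:
    """Reduce polynomial v modulo the trinomial x^m + x^k + 1 (0 < k < m)."""
    low_mask = (1 << m) - 1
    while v.bit_length() > m:
        hi = v >> m
        # x^m = x^k + 1, so the high part folds back in at offsets 0 and k.
        v = (v & low_mask) ^ hi ^ (hi << k)
    return v


def lut_mul(a: int, b: int, m: int, k: int, w: int = 4) -> int:
    """Multiply a*b in GF(2^m) using a 2^w-entry table of digit multiples of a."""
    table = [0] * (1 << w)
    for d in range(1, 1 << w):
        t = 0
        for j in range(w):
            if (d >> j) & 1:
                t ^= a << j                  # add a(x) * x^j
        table[d] = reduce_trinomial(t, m, k)

    acc = 0
    digits = -(-m // w)                      # ceil(m / w)
    mask = (1 << w) - 1
    for i in reversed(range(digits)):
        digit = (b >> (w * i)) & mask
        # Shift the running result by one digit, reduce, then add the LUT word.
        acc = reduce_trinomial(acc << w, m, k) ^ table[digit]
    return acc


# x^232 * x = x^233 = x^74 + 1 in the field defined by x^233 + x^74 + 1
print(lut_mul(1 << 232, 2, 233, 74) == (1 << 74) | 1)
```

The table depends only on the operand a, so in this model it would be rebuilt whenever a changes, which loosely corresponds to the LUT initialization step mentioned in the abstract.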

5.

Design of a polynomial basis multiplier based on a systolic array architecture

Zhuang Jianzhong (庄建忠) Ai Shufeng (艾树峰) 《电讯技术》2013,53(8):1049-1051

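A class of word-serial finite field multiplier architectures based on a systolic array structure is proposed. The architecture is based on the polynomial basis and supports m…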

6.

New Bit-Parallel Systolic Architectures for Computing Multiplication, Multiplicative Inversion and Division in GF(2^m) Under Polynomial Basis and Normal Basis Representations

Chiou-Yng Lee Che Wun Chiou 《Journal of Signal Processing Systems》2008,52(3):313-324

A new bit-parallel systolic multiplier over GF(2^m) under the polynomial basis and normal basis is proposed. This new circuit is constructed from m² identical cells, each of which consists of one two-input AND gate, one three-input XOR gate and five 1-bit latches. Notably, the proposed architecture requires no basis conversion, unlike the well-known multipliers with the redundant representation. With this proposed multiplier, a parallel-in parallel-out systolic array has also been developed for computing inversion and division over GF(2^m). The proposed architectures are well suited to VLSI systems due to their regular interconnection pattern and modular structure.


7.

Scalable Gaussian Normal Basis Multipliers over GF(2^m) Using Hankel Matrix-Vector Representation

Chiou-Yng Lee Che Wun Chiou 《Journal of Signal Processing Systems》2012,69(2):197-211

This work presents a novel scalable multiplication algorithm for a type-t Gaussian normal basis (GNB) of GF(2^m). Utilizing the basic characteristics of MSD-first and LSD-first schemes with d-bit digit size, the GNB multiplication can be decomposed into n(n + 1) Hankel matrix-vector multiplications, where n = (mt + 1)/d. The proposed scalable architectures for computing GNB multiplication comprise one d × d Hankel multiplier, four registers and one final reduction polynomial circuit. Using the relationship of the basis conversion from the GNB to the normal basis, we also present a modified scalable multiplier which requires only nk Hankel multiplications, where k = mt/2d if m is even or k = (mt − t + 2)/2d if m is odd. It is shown that, for digit sizes d ≥ 8, the proposed scalable architectures have significantly lower time-area complexity than existing digit-serial multipliers. Moreover, the proposed architectures have the features of regularity, modularity, and local interconnection. Accordingly, they are well suited for VLSI implementation.

8.

LFSR multipliers over GF(2^m) defined by all-one polynomial

Hyun-Sung Sung-Woon 《Integration, the VLSI Journal》2007,40(4):473-478

This paper presents two bit-serial modular multipliers based on the linear feedback shift register using an irreducible all-one polynomial (AOP) over GF(2^m). First, a new multiplication algorithm and its architecture are proposed for the modular AB multiplication. Then a new algorithm and architecture for the modular AB² multiplication are derived based on the first multiplier. They have significantly lower hardware complexity than previous multipliers because they exploit the property of the AOP, which simplifies the modular reduction compared with using a general irreducible polynomial. Since the proposed multipliers have low hardware requirements and regular structures, they are suitable for VLSI implementation. The proposed multipliers can be used as the kernel architecture for the operations of exponentiation, inversion, and division.
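
Why an all-one polynomial simplifies reduction can be shown with a short sketch. The AOP p(x) = x^m + x^(m-1) + ... + x + 1 divides x^(m+1) + 1, so products can first be reduced modulo x^(m+1) + 1 (a cyclic rotation per partial product) and then corrected once using x^m = x^(m-1) + ... + 1. The Python below illustrates that property only; it is not the LFSR architecture of the paper, and the names `aop_mul` and `ref_mul` are hypothetical. The field GF(2^4) with x^4 + x^3 + x^2 + x + 1 is used as a small example.

```python
def aop_mul(a: int, b: int, m: int) -> int:
    """Multiply a*b in GF(2^m) defined by the all-one polynomial of degree m."""
    n = m + 1
    mask = (1 << n) - 1                      # also the bit pattern of the AOP
    r = 0
    for i in range(m):
        if (b >> i) & 1:
            # a(x) * x^i mod (x^n + 1) is a cyclic left rotation by i bits.
            r ^= ((a << i) | (a >> (n - i))) & mask
    if (r >> m) & 1:
        # x^m = x^(m-1) + ... + x + 1: XOR with all ones clears bit m.
        r ^= mask
    return r


def ref_mul(a: int, b: int, m: int) -> int:
    """Reference shift-and-add multiplier with explicit reduction by the AOP."""
    poly = (1 << (m + 1)) - 1
    r = 0
    for i in range(m):
        if (b >> i) & 1:
            r ^= a
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return r


# Both multipliers agree on every pair of elements of GF(2^4).
print(all(aop_mul(a, b, 4) == ref_mul(a, b, 4)
          for a in range(16) for b in range(16)))
```

The rotation-based version needs no per-step conditional reduction, which is one way to see why AOP-based designs tend to have simple, regular hardware.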

9.

A new inversion algorithm over the finite field GF(2^m)

Xu Dazhuan (徐大专) Xu Zongze (许宗泽) 《电子与信息学报》1998,20(6):771-774

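Based on the normal basis representation over the finite field GF(2^m) and the Massey-Omura multiplier, this paper proposes an inversion algorithm of complexity O(log m). The new algorithm completes one inversion using only [log₂(m-1)]+w(m-1)-1 multiplications and m-1 cyclic shifts, where [x] denotes the largest integer less than or equal to x, and w(m-1) denotes the number of 1s in the binary representation of m-1.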
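
The operation counts in the abstract above can be checked with a small model. The Python sketch below assumes the inversion follows the familiar addition-chain scheme on the exponent m-1 (computing a^(2^k - 1) by repeated "doubling" and "add one" steps, then squaring once); the abstract does not spell out the algorithm, so this is an assumption. It also uses a polynomial basis for simplicity, so each squaring stands in for one cyclic shift of the normal basis. The function names are hypothetical, and GF(2^8) with x^8 + x^4 + x^3 + x + 1 is only a test field.

```python
def gf_mul(a: int, b: int, m: int, poly: int) -> int:
    """Shift-and-add product in GF(2^m); poly includes the x^m term."""
    r = 0
    for i in range(m):
        if (b >> i) & 1:
            r ^= a
        a <<= 1
        if (a >> m) & 1:
            a ^= poly
    return r


def predicted_multiplications(m: int) -> int:
    """[log2(m-1)] + w(m-1) - 1, as stated in the abstract."""
    e = m - 1
    return (e.bit_length() - 1) + bin(e).count("1") - 1


def gf_inv(a: int, m: int, poly: int):
    """Return (a^-1, multiplications, squarings) for nonzero a in GF(2^m)."""
    mults = squarings = 0

    def sq(x: int, times: int) -> int:
        nonlocal squarings
        for _ in range(times):
            x = gf_mul(x, x, m, poly)
            squarings += 1
        return x

    beta, k = a, 1                            # invariant: beta = a^(2^k - 1)
    for bit in bin(m - 1)[3:]:                # bits of m-1 after the leading 1
        beta = gf_mul(sq(beta, k), beta, m, poly)   # k -> 2k
        mults += 1
        k *= 2
        if bit == "1":
            beta = gf_mul(sq(beta, 1), a, m, poly)  # k -> k + 1
            mults += 1
            k += 1
    return sq(beta, 1), mults, squarings      # a^(2^m - 2) = a^-1


inv, mults, squarings = gf_inv(0x53, 8, 0x11B)
print(hex(inv), mults, squarings, predicted_multiplications(8))
```

For m = 233 the formula gives 7 + 4 - 1 = 10 multiplications, since 232 is 11101000 in binary.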

10.

Concurrent error detection architectures for Gaussian normal basis multiplication over GF(2^m)

Chiou-Yng Lee 《Integration, the VLSI Journal》2010,43(1):113-123

This paper presents a method of using a parity prediction scheme for detecting erroneous outputs in bit-parallel, sequential, and digit-serial Gaussian normal basis (GNB) multipliers over GF(2^m). Although all-type NB multipliers have different time and space complexities, our analytical results indicate that all-type GNB multipliers have the same structure if they use a parity prediction function. For example, in the field GF(2²³³), we have estimated that the error detection rate for a sequential multiplier is nearly 100% if a comparison is made every clock cycle. Our analytical results also show that the area overhead of the proposed digit-serial multiplier with concurrent error detection does not exceed 5%. Several efficient parity prediction techniques are shown in this work to provide a low-overhead solution to concurrent error detection, particularly when cryptographic implementations using GF(2^m) multipliers require higher reliability and protection against adversarial attacks.

11.

Novel Radix Finite Field Multiplier for GF(2^m)

M.C. Mekhallalati A.S. Ashur M.K. Ibrahim 《The Journal of VLSI Signal Processing》1997,15(3):233-245

In this paper, a new High-Radix Finite Field multiplication algorithm for GF(2^m) is proposed for the first time. The proposed multiplication algorithm can operate in a Digit-serial fashion, and hence can give a trade-off between the speed, the area, the input/output pin limitation, and the power consumption by simply varying the digit size. A detailed example of a new Radix-16 GF(2^m) Digit-Serial multiplication architecture adopting the proposed algorithm illustrates a speed improvement of 75% when compared to a conventional Radix-2 bit-serial realization. This is made more significant when it is noted that the speed improvement of 75% was achieved at the expense of only a 2.3 times increase in the hardware requirements of the proposed architecture.

12.

Low Power Semi-systolic Architectures for Polynomial-Basis Multiplication over GF(2^m) Using Progressive Multiplier Reduction

Atef Ibrahim Fayez Gebali 《Journal of Signal Processing Systems》2016,82(3):331-343

We present low-area and low-power semi-systolic array architectures for polynomial basis multiplication over GF(2^m) using the Progressive Multiplier Reduction (PMR) technique. These architectures are explored using linear and nonlinear techniques applied to the polynomial multiplication algorithm. The nonlinear techniques allow the designer to control the processor workload and reduce the inter-processor communications. The semi-systolic architectures obtained have a simple structure with local communication. ASIC implementations of our designs and comparable published designs show that the proposed scalable semi-systolic structures have lower area complexity (by 56.8–94.6%) and power consumption (by 55.2–84.2%), except when compared with a scalable design published by the same authors. However, one of the proposed scalable designs outperforms this design in terms of throughput by 73.8%. This makes the proposed designs suited to embedded applications that require low power consumption and moderate speed.

13.

Improved Scalar Multiplication on Elliptic Curves Defined over F_2^mn

Dong Hoon Lee Seongtaek Chee Sang Cheol Hwang Jae‐Cheol Ryou 《ETRI Journal》2004,26(3):241-251

We propose two improved scalar multiplication methods on elliptic curves over F_q^n, where q = 2^m, using Frobenius expansion. The scalar multiplication of elliptic curves defined over the subfield F_q can be sped up by Frobenius expansion. Previous methods are restricted to the case of a small m. However, when m is small, it is hard to find curves having good cryptographic properties. Our methods are suitable for curves defined over medium‐sized fields, that is, 10 ≤ m ≤ 20. These methods are variants of the conventional multiple‐base binary (MBB) method combined with the window method. One of our methods is for a polynomial basis representation with software implementation, and the other is for a normal basis representation with hardware implementation. Our software experiment shows that our method is about 10% faster than the MBB method, which also uses Frobenius expansion, and about 20% faster than the Montgomery method, which is the fastest general method in polynomial basis implementation.

14.

Basis Translation Matrix between Two Isomorphic Extension Fields via Optimal Normal Basis

Yasuyuki Nogami Ryo Namba Yoshitaka Morikawa 《ETRI Journal》2008,30(2):326-334

This paper proposes a method for generating a basis translation matrix between isomorphic extension fields. To generate a basis translation matrix, we need the equality correspondence of a basis between the isomorphic extension fields. Consider an extension field F_p^m, where p is the characteristic. As a brute force method, when p^m is small, we can check the equality correspondence by using the minimal polynomial of a basis element; however, when p^m is large, this becomes too difficult. The proposed methods are based on the fact that Type I and Type II optimal normal bases (ONBs) can be easily identified in each isomorphic extension field. The proposed methods efficiently use Type I and Type II ONBs and can generate a pair of basis translation matrices within 15 ms on a Pentium 4 (3.6 GHz) when m log₂ p = 160.

15.

Utilization of Pipeline Technique in AOP Based Multipliers with Parallel Inputs

Morteza Nikooghadam Ali Zakerolhosseini 《Journal of Signal Processing Systems》2013,72(1):57-62

Many all-one polynomial (AOP) based multipliers over GF(2^m) have been proposed. Previously proposed multipliers have a serial input structure and also suffer from a long critical path delay. In this paper we improve AOP based multipliers by reducing the critical path delay and changing the input structure to parallel. Initially, we modify the wiring of the previously proposed AOP based multipliers. This approach reduces the critical path delay from O(m) to O(log m). In order to further reduce this delay from O(log m) to O(1), the pipeline technique is utilized. The efficiency of the proposed architectures is evaluated based on criteria of time (latency, critical path) and space complexity (gate-latch number).
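
The abstract above does not describe the rewiring itself. One common way a critical path drops from O(m) to O(log m) is replacing a linear chain of XOR gates with a balanced XOR tree, and the Python sketch below models only that general idea, under that assumption. The function names are hypothetical, and "depth" counts two-input XOR gates on the longest path for a non-empty list of bits.

```python
from functools import reduce
from operator import xor


def xor_chain(bits):
    """XOR the bits left to right; the longest path has len(bits) - 1 gates."""
    return reduce(xor, bits), len(bits) - 1


def xor_tree(bits):
    """XOR the bits pairwise, level by level; depth is ceil(log2(len(bits)))."""
    level, depth = list(bits), 0
    while len(level) > 1:
        nxt = [level[i] ^ level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])            # odd element passes through
        level, depth = nxt, depth + 1
    return level[0], depth


bits = [1, 0, 1, 1, 0, 1, 0, 0]
print(xor_chain(bits))   # same value, depth 7
print(xor_tree(bits))    # same value, depth 3
```

Both forms compute the same sum over GF(2); only the gate depth differs, and inserting latches between tree levels would be one way to bring the per-stage delay down to a constant.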

16.

An efficient and high-speed VLSI implementation of optimal normal basis multiplication over GF(2^m)

《Integration, the VLSI Journal》2016

Finite field multiplication is one of the most important operations in finite field arithmetic and the main building block determining overall speed and area in public key cryptosystems. In this work, efficient and high-speed VLSI implementations of bit-serial, digit-serial and bit-parallel optimal normal basis multipliers with parallel-input serial-output (PISO) and parallel-input parallel-output (PIPO) structures are presented. Two general multipliers, namely Massey–Omura (MO) and Reyhani Masoleh–Hassan (RMH), are considered as case studies for implementation. These multipliers are constructed by using AND, XOR–AND and XOR tree components. In the MO multiplier, to have strong input signals and a better implementation, the row of AND gates is implemented by using inverter and NOR components. Also, the XOR–AND component in the RMH structure is implemented using a new low-cost structure. The XOR tree in both multipliers consists of a high number of logic stages and many inputs; therefore, to optimally decrease the delay and increase the drive ability of the circuit for different loads, the logical effort method is employed as an efficient method for sizing the transistors. The multipliers are first designed for different load capacitances using different structures and different numbers of stages. Then, using the logical effort method and a newly proposed 4-input XOR gate structure, the circuits are modified to achieve minimum delay. Using 0.18 μm CMOS technology, the bit-serial, digit-serial and bit-parallel structures with type-1 and type-2 optimal normal bases are implemented over the finite fields GF(2²²⁶) and GF(2²³³), respectively. The results show that the proposed structures have better delay and area characteristics compared to previous designs.

17.

High-Speed Array Multipliers Based on On-the-Fly Conversion

Sang-Man Moh Suk-Han Yoon 《ETRI Journal》1997,19(4):317-325

A new on-the-fly conversion algorithm is proposed, and high-speed array multipliers with the on-the-fly conversion are presented. The new on-the-fly conversion logic is used to speed up carry-propagate addition at the last stage of multiplication, and provides constant delay independent of the number of input bits. In this paper, the multiplication architecture and the on-the-fly conversion algorithm are presented and discussed in detail. The proposed architecture has a multiplication time of (n + 1)t_FA, where n is the number of input bits and t_FA is the delay of a full adder. According to our comparative performance evaluation, the proposed architecture has shorter delay and requires less area than the conventional array multiplier with on-the-fly conversion.

18.

An Efficient Method for Evaluating Complex Polynomials

Miloš D. Ercegovac Jean-Michel Muller 《Journal of Signal Processing Systems》2010,58(1):17-27

We propose an efficient hardware-oriented method for evaluating complex polynomials. The method is based on solving iteratively a system of linear equations. The solutions are obtained digit-by-digit on simple and highly regular hardware. The operations performed are defined over the reals. We describe a complex-to-real transform, a complex polynomial evaluation algorithm, the convergence conditions, and a corresponding design and implementation. The latency and the area are estimated for the radix-2 case. The main features of the method are: a latency of about m cycles for an m-bit precision; a cycle time independent of the precision; a design consisting of identical modules; and digit-serial connections between the modules. The number of modules, each roughly corresponding to a serial-parallel multiplier without a carry-propagate adder, is 2(n + 1) for evaluating an n-th degree complex polynomial. The method can also be used to compute all successive integer powers of the complex argument with the same latency and a similar implementation cost. The design allows straightforward tradeoffs between latency and cost: a factor k decrease in cost leads to a factor k increase in latency. A similar tradeoff between precision, latency and cost exists. The proposed method is attractive for programmable platforms because of its regular and repetitive structure of simple hardware operators.

19.

Concurrent Error Detection in Multiplexer-Based Multipliers for Normal Basis of GF(2^m) Using Double Parity Prediction Scheme

Chiou-Yng Lee Che Wun Chiou Jim-Min Lin 《Journal of Signal Processing Systems》2010,58(2):233-246

Successful implementation of elliptic curve cryptographic systems primarily depends on efficient and reliable arithmetic circuits for finite fields with very large orders. Thus, robust encryption/decryption algorithms are needed. Multiplication is the most important finite field arithmetic operation. It is much more complex than finite field addition, and it is frequently used in performing point operations in elliptic curve groups. The hardware implementation of a multiplication operation may require millions of logic gates and may thus lead to erroneous outputs. To obtain reliable cryptographic applications, a novel concurrent error detection (CED) architecture to detect erroneous outputs in a multiplexer-based normal basis (NB) multiplier over GF(2^m) using the parity prediction scheme is proposed in this article. Although various NB multipliers, depending on \( \alpha \alpha^{2^i} = \sum_{j=0}^{m-1} t_{i,j} \alpha^{2^j} \), have different time and space complexities, NB multipliers will have the same structure if they use a parity prediction function. By using the structure of the proposed CED NB multiplier, a CED scalable multiplier over composite fields with 100% error detection rate is also presented.

20.

Low-Energy Digit-Serial/Parallel Finite Field Multipliers (total citations: 5, self-citations: 0, citations by others: 5)

Leilei Song Keshab K. Parhi 《The Journal of VLSI Signal Processing》1998,19(2):149-166

Digit-serial architectures are best suited for systems requiring a moderate sample rate and where area and power consumption are critical. This paper presents a new approach for designing digit-serial/parallel finite field multipliers. This approach combines both array-type and parallel multiplication algorithms, where the digit-level array-type algorithm minimizes the latency for one multiplication operation and the parallel architecture inside each digit cell reduces both the cycle time and the switching activity, and hence the power consumption. By appropriately constraining the feasible primitive polynomials, the mod p(x) operation involved in finite field multiplication can be performed in a more efficient way. As a result, the computation delay and energy consumption of one finite field multiplication using the proposed digit-serial/parallel architectures are significantly less than those obtained by folding the parallel semi-systolic multipliers. Furthermore, their energy-delay products are reduced by an even larger percentage. Therefore, the proposed digit-serial/parallel architectures are attractive for both low-energy and high-performance applications.