首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 234 毫秒
1.
Improved Montgomery modular inverse algorithm   总被引:3,自引:0,他引:3  
A new, single and unified Montgomery modular inverse algorithm, which performs both classical and Montgomery modular inversion, is proposed. This reduces the number of Montgomery multiplication operations required by 33% when compared with previous algorithms reported in the literature. The use of this in practice has been investigated by implementation of the improved unified algorithm and the previous algorithms on FPGA devices. The unified algorithm implementation shows a significant speed-up and a reduction in silicon area usage.  相似文献   

2.
 在椭圆曲线密码中,模逆运算是有限域运算中最复杂、最耗时且硬件实现难度最大的运算.本文在Kaliski算法的基础上,提出了基于有符号数字系统的Montgomery模逆算法,它支持素数域和二进制域上任意多精度参数的求模逆运算.据此算法,设计了相应的硬件结构方案,并给出了面积复杂度和时间复杂度分析.仿真结果表明,相比于其它模逆算法硬件设计方案,本文提出的基于有符号数字系统的Montgomery模逆算法在运算速度、电路面积、灵活性等方面具有显著的优越性.  相似文献   

3.
In this paper, a new High-Radix Finite Field multiplication algorithm for GF(2m) is proposed for the first time. The proposed multiplication algorithm can operate in a Digit-serial fashion, and hence can give a trade-off between the speed, the area , the input/output pin limitation, and the low power consumption by simply varying the digit size. A detailed example of a new Radix-16 GF(2m) Digit-Serial multiplication architecture adopting the proposed algorithm illustrates a speed improvement of 75% when compared to conventional Radix-2 bit-serial realization. This is made more significant when it is noted that the speed improvement of 75% was achieved at the expense of only 2.3 times increase in the hardware requirements of the proposed architecture.  相似文献   

4.
A novel hardware architecture for elliptic curve cryptography (ECC) over$ GF(p)$is introduced. This can perform the main prime field arithmetic functions needed in these cryptosystems including modular inversion and multiplication. This is based on a new unified modular inversion algorithm that offers considerable improvement over previous ECC techniques that use Fermat's Little Theorem for this operation. The processor described uses a full-word multiplier which requires much fewer clock cycles than previous methods, while still maintaining a competitive critical path delay. The benefits of the approach have been demonstrated by utilizing these techniques to create a field-programmable gate array (FPGA) design. This can perform a 256-bit prime field scalar point multiplication in 3.86 ms, the fastest FPGA time reported to date. The ECC architecture described can also perform four different types of modular inversion, making it suitable for use in many different ECC applications.  相似文献   

5.
A new bit-parallel systolic multiplier over GF(2m) under the polynomial basis and normal basis is proposed. This new circuit is constructed by m 2 identical cells, each of which consists of one two-input AND gate, one three-input XOR gate and five 1-bit latches. Especially, the proposed architecture is without the basis conversion as compared to the well-known multipliers with the redundant representation. With this proposed multiplier, a parallel-in parallel-out systolic array has also been developed for computing inversion and division over GF(2m). The proposed architectures are well suited to VLSI systems due to their regular interconnection pattern and modular structure.
Che Wun ChiouEmail:
  相似文献   

6.
An architecture based on the RSA public key cryptography algorithm is presented. The circuit includes two components, one for modular squaring and one for modular multiplication. Each component is based on the Montgomery algorithm and implements the modular operations using two modified serial-parallel multipliers. A full modular exponentiation is completed every n(n + 3) clock cycles. All circuits are systolic, operate with 100% efficiency and their maximum combinational delay is equal to one gated Full-Adder. Thus, high-speed performance is achieved while the low cell hardware complexity enables an efficient VLSI implementation.  相似文献   

7.
This contribution focuses on a class of Galois field used to achieve fast finite field arithmetic which we call an Optimal Extension Field (OEF), first introduced in [3]. We extend this work by presenting an adaptation of Itoh and Tsujii's algorithm for finite field inversion applied to OEFs. In particular, we use the facts that the action of the Frobenius map in GF (p m ) can be computed with only m-1 subfield multiplications and that inverses in GF (p) may be computed cheaply using known techniques. As a result, we show that one extension field inversion can be computed with a logarithmic number of extension field multiplications. In addition, we provide new extension field multiplication formulas which give a performance increase. Further, we provide an OEF construction algorithm together with tables of Type I and Type II OEFs along with statistics on the number of pseudo-Mersenne primes and OEFs. We apply this new work to provide implementation results using these methods to construct elliptic curve cryptosystems on both DEC Alpha workstations and Pentium-class PCs. These results show that OEFs when used with our new inversion and multiplication algorithms provide a substantial performance increase over other reported methods. Received 7 July 1999 and revised 29 March 2000 Online publication 15 September 2000  相似文献   

8.
This paper presents two bit-serial modular multipliers based on the linear feedback shift register using an irreducible all one polynomial (AOP) over GF(2m). First, a new multiplication algorithm and its architecture are proposed for the modular AB multiplication. Then a new algorithm and architecture for the modular AB2 multiplication are derived based on the first multiplier. They have significantly smaller hardware complexity than the previous multipliers because of using the property of AOP. It simplifies the modular reduction compared with the case of using the generalized irreducible polynomial. Since the proposed multipliers have low hardware requirements and regular structures, they are suitable for VLSI implementation. The proposed multipliers can be used as the kernel architecture for the operations of exponentiation, inversion, and division.  相似文献   

9.
A new efficient modular division algorithm suitable for systolic implementation and its systolic architecture is proposed in this article. With a new exit condition of while loop and a new updating method of a control variable, the new algorithm reduces the average of iteration numbers by more than 14.3% compared to the algorithm proposed by Chen, Bai and Chen. Based on the new algorithm, we design a fast systolic architecture with an optimised core computing cell. Compared to the architecture proposed by Chen, Bai and Chen, our systolic architecture has reduced the critical path delay by about 18% and the total computational time for one modular division by almost 30%, with the cost of about 1% more cells. Moreover, by the addition of a flag signal and three logic gates, the proposed systolic architecture can also perform Montgomery modular multiplication and a fast unified modular divider/multiplier is realised.  相似文献   

10.
In this paper, a new High-Radix Finite Field multiplication algorithm for GF(2m) is proposed for the first time. The proposed multiplication algorithm can operate in a Digit-serial fashion, and hence can give a trade-off between the speed, the area , the input/output pin limitation, and the low power consumption by simply varying the digit size. A detailed example of a new Radix-16 GF(2m) Digit-Serial multiplication architecture adopting the proposed algorithm illustrates a speed improvement of 75% when compared to conventional Radix-2 bit-serial realization. This is made more significant when it is noted that the speed improvement of 75% was achieved at the expense of only 2.3 times increase in the hardware requirements of the proposed architecture.  相似文献   

11.
This article presents the VLSI design of a configurable RSA public key cryptosystem supporting the 512-bit, 1024-bit and 2048-bit based on Montgomery algorithm achieving comparable clock cycles of current relevant works but with smaller die size. We use binary method for the modular exponentiation and adopt Montgomery algorithm for the modular multiplication to simplify computational complexity, which, together with the systolic array concept for electric circuit designs effectively, lower the die size. The main architecture of the chip consists of four functional blocks, namely input/output modules, registers module, arithmetic module and control module. We applied the concept of systolic array to design the RSA encryption/decryption chip by using VHDL hardware language and verified using the TSMC/CIC 0.35 m 1P4 M technology. The die area of the 2048-bit RSA chip without the DFT is 3.9 × 3.9 mm2 (4.58 × 4.58 mm2 with DFT). Its average baud rate can reach 10.84 kbps under a 100 MHz clock.  相似文献   

12.
We present a new serial-parallel concurrent modular-multiplication algorithm and architecture suitable for standard RSA encryption. In the new scheme, multiplication is performed modulo a multiple of the RSA modulus n, which has a diminished-radix form 2 k -v, where k and v are positive integers and v < n. This design is the first concurrent modular multiplier to use a diminished-radix algorithm and to pipeline concurrent modular-reduction to optimize the clock rate. For a modular multiplier of order ranging from 1 to 10 (number of multiplier bits per clock cycle), a faster clock rate and throughput is possible than with other known designs including those of Brickell, Morita, Sedlak and Golze, and Miyaguchi. Throughput estimates for 512-bit RSA decryption range from 100 kbit/s in a serial mode to 650 kbit/s with a modular multiplier of order 10, at a clock rate of 20 MHz on 1.5 m CMOS.  相似文献   

13.
A VLSI architecture for the generalized bit-flipping decoding algorithm for non-binary low-density parity-check codes is proposed in this paper. The tentative decoding steps of the algorithm have been modified to avoid computing and storing a matrix of dimension N×2 q , for a code (N,K) over GF(2 q ), reducing its complexity with a minimal penalization of its performance, less than 0.05 dB compared with the original algorithm. The architecture was synthesized using a 90 nm standard cell library, for the (837,723) non-binary code over GF(25), requiring 590220 xor gates and achieving a throughput of 89 Mbps. Additionally, it was implemented in a Virtex-VI FPGA device with a cost of 4070 slices and a throughput of 44.6 Mbps.  相似文献   

14.
A new, efficient algorithm for modular inversion in Galois field GF(p) is presented. Very large scale integration (VLSI) implementation of the algorithm is described and the high performance and low silicon penalty is demonstrated  相似文献   

15.
16.
We present a high-speed public-key cryptoprocessor that exploits three-level parallelism in Elliptic Curve Cryptography (ECC) over GF(2 n ). The proposed cryptoprocessor employs a Parallelized Modular Arithmetic Logic Unit (P-MALU) that exploits two types of different parallelism for accelerating modular operations. The sequence of scalar multiplications is also accelerated by exploiting Instruction-Level Parallelism (ILP) and processing multiple P-MALU instructions in parallel. The system is programmable and hence independent of the type of the elliptic curves and scalar multiplication algorithms. The synthesis results show that scalar multiplication of ECC over GF(2163) on a generic curve can be computed in 20 and 16 μs respectively for the binary NAF (Non-Adjacent Form) and the Montgomery method. The performance can be accelerated furthermore on a Koblitz curve and reach scalar multiplication of 12 μs with the TNAF (τ-adic NAF) method. This fast performance allows us to perform over 80,000 scalar multiplications per second and to enhance security in wireless mobile applications.
Ingrid VerbauwhedeEmail:
  相似文献   

17.
RSA密码协处理器的实现   总被引:11,自引:0,他引:11  
李树国  周润德  冯建华  孙义和 《电子学报》2001,29(11):1441-1444
密码协处理器的面积过大和速度较慢制约了公钥密码体制RSA在智能卡中的应用.文中对Montgomery模乘算法进行了分析和改进,提出了一种新的适合于智能卡应用的高基模乘器结构.由于密码协处理器采用两个32位乘法器的并行流水结构,这与心动阵列结构相比它有效地降低了芯片的面积和模乘的时钟数,从而可在智能卡中实现RSA的数字签名与认证.实验表明:在基于0.35μm TSMC标准单元库工艺下,密码协处理器执行一次1024位模乘需1216个时钟周期,芯片设计面积为38k门.在5MHz的时钟频率下,加密1024位的明文平均仅需374ms.该设计与同类设计相比具有最小的模乘运算时钟周期数,并使芯片的面积降低了1/3.这个指标优于当今电子商务的密码协处理器,适合于智能卡应用.  相似文献   

18.
In this paper, the primitive common-multiplicand Montgomery modular multiplication is developed for modular exponentiation. Together with Montgomery powering ladder, a fast, compact and symmetric modular exponentiation architecture is proposed for hardware implementation. The architecture consists of one group of processing elements along the central line and two symmetric groups of accumulation units on two sides. The central elements perform modular reductions, while the symmetric units on both sides accumulate the modular multiplication results. A feedforwarding architecture is employed to decrease the latency between processing elements, in parallel with the word-based accumulation units, which are also pipelined. Meanwhile, due to the symmetric architecture and Montgomery powering ladder, the modular exponentiation is immune from fault and simple power attacks. Implemented in FPGA platform, the performance of our proposed design outperforms most results so far in the literature.  相似文献   

19.
An implementation for a fast public-key cryptosystem   总被引:9,自引:0,他引:9  
In this paper we examine the development of a high-speed implementation of a system to perform exponentiation in fields of the form GF(2 n ). For sufficiently large n, this device has applications in public-key cryptography. The selection of representation and observations on the structure of multiplication have led to the development of an architecture which is of low complexity and high speed. A VLSI implementation has being fabricated with measured throughput for exponentiation for cryptographic purposes of approximately 300 kilobits per second.  相似文献   

20.
Modular exponentiation is the cornerstone computation in public-key cryptography systems such as RSA cryptosystems. The operation is time consuming for large operands. This paper describes the characteristics of three architectures designed to implement modular exponentiation using the fast binary method: the first field-programmable gate array (FPGA) prototype has a sequential architecture, the second has a parallel architecture, and the third has a systolic array-based architecture. The paper compares the three prototypes as well as Blum and Paar's implementation using the time /spl times/ area classic factor. All three prototypes implement the modular multiplication using the popular Montgomery algorithm.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号