首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
高性能可扩展公钥密码协处理器研究与设计   总被引:1,自引:0,他引:1       下载免费PDF全文
黎明  吴丹  戴葵  邹雪城 《电子学报》2011,39(3):665-670
 本文提出了一种高效的点乘调度策略和改进的双域高基Montgomery模乘算法,在此基础上设计了一种新型高性能可扩展公钥密码协处理器体系结构,并采用0.18μm 1P6M标准CMOS工艺实现了该协处理器,以支持RSA和ECC等公钥密码算法的计算加速.该协处理器通过扩展片上高速存储器和使用以基数为处理字长的方法,具有良好的可扩展性和较强的灵活性,支持2048位以内任意大数模幂运算以及576位以内双域任意椭圆曲线标量乘法运算.芯片测试结果表明其具有很好的加速性能,完成一次1024位模幂运算仅需197μs、GF(p)域192位标量乘法运算仅需225μs、GF(2m)域163位标量乘法运算仅需200.7μs.  相似文献   

2.
This paper presents a new modular multiplication algorithm that allows one to implement modular multiplications efficiently. It proposes a systematic approach for maximizing a level of parallelism when performing a modular multiplication. The proposed algorithm effectively integrates three different existing algorithms, a classical modular multiplication based on Barrett reduction, the modular multiplication with Montgomery reduction and the Karatsuba multiplication algorithms in order to reduce the computational complexity and increase the potential of parallel processing. The algorithm is suitable for both hardware implementations and software implementations in a multiprocessor environment. To show the effectiveness of the proposed algorithm, we implement several hardware modular multipliers and compare the area and performance results. We show that a modular multiplier using the proposed algorithm achieves a higher speed comparing to the modular multipliers based on the previously proposed algorithms.  相似文献   

3.
Modular exponentiation in public-key cryptosystems is usually achieved by repeated modular multiplications on large integers. Designing high-speed modular multiplication is thus very crucial to speed up the decryption/encryption process. In this paper, we first explore how to relax the data dependency that exists between multiplication, quotient determination, and modular reduction in the conventional Montgomery modular multiplication algorithm. Then, we propose a new modular multiplication algorithm for high-speed hardware design. The speed improvement is achieved by reducing the critical path delay from the 4-to-2 to 3-to-2 carry-save addition. The resulting time complexity of our development is further decreased by simultaneously performing the multiplication and modular reduction processes. Experimental results show that the developed modular multiplication can operate at speeds higher than those of related work. When the proposed modular multiplication is applied to modular exponentiation, both time and area-time advantages are obtained.  相似文献   

4.
In this paper, the primitive common-multiplicand Montgomery modular multiplication is developed for modular exponentiation. Together with Montgomery powering ladder, a fast, compact and symmetric modular exponentiation architecture is proposed for hardware implementation. The architecture consists of one group of processing elements along the central line and two symmetric groups of accumulation units on two sides. The central elements perform modular reductions, while the symmetric units on both sides accumulate the modular multiplication results. A feedforwarding architecture is employed to decrease the latency between processing elements, in parallel with the word-based accumulation units, which are also pipelined. Meanwhile, due to the symmetric architecture and Montgomery powering ladder, the modular exponentiation is immune from fault and simple power attacks. Implemented in FPGA platform, the performance of our proposed design outperforms most results so far in the literature.  相似文献   

5.
Improved Montgomery modular inverse algorithm   总被引:3,自引:0,他引:3  
A new, single and unified Montgomery modular inverse algorithm, which performs both classical and Montgomery modular inversion, is proposed. This reduces the number of Montgomery multiplication operations required by 33% when compared with previous algorithms reported in the literature. The use of this in practice has been investigated by implementation of the improved unified algorithm and the previous algorithms on FPGA devices. The unified algorithm implementation shows a significant speed-up and a reduction in silicon area usage.  相似文献   

6.
A new efficient modular division algorithm suitable for systolic implementation and its systolic architecture is proposed in this article. With a new exit condition of while loop and a new updating method of a control variable, the new algorithm reduces the average of iteration numbers by more than 14.3% compared to the algorithm proposed by Chen, Bai and Chen. Based on the new algorithm, we design a fast systolic architecture with an optimised core computing cell. Compared to the architecture proposed by Chen, Bai and Chen, our systolic architecture has reduced the critical path delay by about 18% and the total computational time for one modular division by almost 30%, with the cost of about 1% more cells. Moreover, by the addition of a flag signal and three logic gates, the proposed systolic architecture can also perform Montgomery modular multiplication and a fast unified modular divider/multiplier is realised.  相似文献   

7.
经典 Montgomery 阶梯算法是提高椭圆曲线加密运算效率的有效方法之一。首先利用循环展开技术,提出了一种改进的 Montgomery 阶梯算法。然后根据 Montgomery 椭圆曲线加密算法的特点,在其读入数据环节采取数据并行方式进行处理;在其模幂运算环节采取任务并行方式进行处理。仿真实验结果表明,采用数据并行和任务并行2种方式,可有效提升椭圆曲线加密运算的效率。  相似文献   

8.
分析了Montgomery算法,指出用改进的预计算Montgomery算法实现模幂运算的过程,分析并比较了两种实现模采和模幂乘算法。并分别用C^ 和Modeleim进行仿真,得出仿真测试结果。  相似文献   

9.
Modular inverse arithmetic plays an important role in elliptic curve cryptography. Based on the analysis of Montgomery modular inversion algorithm, this paper presents a new dual-field modular inversion algorithm, and a novel scalable and unified architecture for Montgomery inverse hardware in finite fields GF(p) and GF(2 n ) is proposed. Furthermore, this architecture based on the new modular inversion algorithm has been verified by modeling it in Verilog-HDL, and accomplished it under 0.18 μm CMOS technology. The result indicates that our work has better performance and flexibility than other works.  相似文献   

10.
以RSA算法为例,探讨了公钥密码处理芯片的设计与实现.首先提出了公钥密码芯片实现中的核心问题,即大整数模幂运算算法和大整数模乘运算算法的实现;然后针对RSA算法,提出了Montgomery模乘算法的CIOS方法的一种新的快速硬件并行实现方法,其中采用了加法与乘法并行运算以及多级流水线技术以提高性能,较大地减少了乘法运算时间,提高了模乘器的性能.  相似文献   

11.
An improved systolic array for the Montgomery modular multiplication algorithm is presented. The recursive equation proposed by Walter [1993] is explicitly transformed into Boolean algebraic equations. By analysing and optimising these equations an improved systolic array can be made 20% faster than that proposed by Walter  相似文献   

12.
This paper focuses on the design and implementation of a fast reconfigurable method for elliptic curve cryptography acceleration in GF(2 m ). The main contribution of this paper is comparing different reconfigurable modular multiplication methods and modular reduction methods for software implementation on Intel IA-32 processors, optimizing point arithmetic to reduce the number of expensive reduction operations through a novel reduction sharing technique, and measuring performance for scalar point multiplication in GF(2 m ) on Intel IA-32 processors. This paper determined that systematic reduction is best for fields defined with trinomials or pentanomials; however, for fields defined with reduction polynomials with large Hamming weight Barrett reduction is best. In GF(2571) for Intel P4 2.8 GHz processor, long multiplication with systematic reduction was 2.18 and 2.26 times faster than long multiplication with Barrett or Montgomery reduction. This paper determined that Montgomery Invariant scalar point multiplication with Systematic reduction in Projective coordinates was the fastest method for single scalar point multiplication for the NIST fields from GF(2163) to GF(2571). For single scalar point multiplication on a reconfigurable elliptic curve cryptography accelerator, we were able to achieve ∼6.1 times speedup using reconfigurable reduction methods with long multiplication, Montgomery’s MSB Invariant method in projective coordinates, and systematic reduction. Further extensions were made to implement fast reconfigurable elliptic curve cryptography for repeated scalar point multiplication on the same base point. We also show that for L > 20 the LSB invariant method combined with affine doubling precomputation outperforms the LSB invariant method combined with López-Dahab doubling precomputation for all reconfigurable reduction polynomial techniques in GF(2571) for Intel IA-32 processors. For L = 1000, the LSB invariant scalar point multiplication method was 13.78 to 34.32% faster than using the fastest Montgomery Invariant scalar point multiplication method on Intel IA-32 processors.  相似文献   

13.
We propose an efficient multi-exponentiation algorithm based on the modified Booth' algorithm and Montgomery's modular multiplication algorithm. The multi-exponentiation algorithm can be used to implement fast modern cryptosystems. Owing to the reduced number of multiplications, this algorithm is about 10% faster than Pekmestzi's algorithms. The proposed algorithm can be implemented in hardware as a small component. The component can then be used to form an efficient modular multi-exponentiation module by combining it with an efficient Montgomery modular multiplication module.  相似文献   

14.
基于RSA系统的Montgomery算法的改进设计   总被引:5,自引:0,他引:5  
针对Montgomery算法中模乘模块的CIOS模式提出了一种改进算法。该算法模式比原CIOS模式节省了近一半的操作次数,并且给出了一种优化的硬件实现结构。在保证系统规模较小的基础上采用了两个相同的数据通路以加速运算速度,同时采用了移位寄存器结构进一步简化时序控制的复杂性。此改进算法适用于各种公钥体制的加解密处理器。  相似文献   

15.
在公钥密码实现中,Montgomery模乘扮演着非常重要的角色。本文研究Montgomery模乘(MMM)的迭代控制结构,给出了进行MMM迭代的输入边界控制条件,以及改进的MMM算法。这种扩展的迭代控制条件适合用于复杂求幂的迭代过程,在其边界控制下可直接进行一些加法、减法及乘法等基本运算,而无须模约化处理。给出的模乘迭代算法具有高度的灵活性,可利用来实现安全高效的RSA、ECC等公钥密码体制。  相似文献   

16.
一款RSA模乘幂运算器的设计与实现   总被引:4,自引:1,他引:3  
刘强  佟冬  程旭 《电子学报》2005,33(5):923-927
通讯技术的高速发展需要更高性能的密码处理设备.本文介绍的RSA模乘幂运算器,采用蒙哥马利模乘法算法和指数的从右到左的二进制方法,并根据大整数模乘法运算和VLSI实现的要求进行改进,提供高速RSA模乘幂运算能力.该RSA运算器在其模乘法器中使用了进位保留加法器结构以避免长进位链.我们提出了信号多重备份的方法,解决大整数运算结构中关键信号广播带来的负载问题.  相似文献   

17.
Modular exponentiation is the cornerstone computation in public-key cryptography systems such as RSA cryptosystems. The operation is time consuming for large operands. This paper describes the characteristics of three architectures designed to implement modular exponentiation using the fast binary method: the first field-programmable gate array (FPGA) prototype has a sequential architecture, the second has a parallel architecture, and the third has a systolic array-based architecture. The paper compares the three prototypes as well as Blum and Paar's implementation using the time /spl times/ area classic factor. All three prototypes implement the modular multiplication using the popular Montgomery algorithm.  相似文献   

18.
In this paper an improved Montgomery multiplier, based on modified four-to-two carry-save adders (CSAs) to reduce critical path delay, is presented. Instead of implementing four-to-two CSA using two levels of carry-save logic, authors propose a modified four-to-two CSA using only one level of carry-save logic taking advantage of pre-computed input values. Also, a new bit-sliced, unified and scalable Montgomery multiplier architecture, applicable for both RSA and ECC (Elliptic Curve Cryptography), is proposed. In the existing word-based scalable multiplier architectures, some processing elements (PEs) do not perform useful computation during the last pipeline cycle when the precision is not equal to an exact multiple of the word size, like in ECC. This intrinsic limitation requires a few extra clock cycles to operate on operand lengths which are not powers of 2. The proposed architecture eliminates the need for extra clock cycles by reconfiguring the design at bit-level and hence can operate on any operand length, limited only by memory and control constraints. It requires 2∼15% fewer clock cycles than the existing architectures for key lengths of interest in RSA and 11∼18% for binary fields and 10∼14% for prime fields in case of ECC. An FPGA implementation of the proposed architecture shows that it can perform 1,024-bit modular exponentiation in about 15 ms which is better than that by the existing multiplier architectures.
M. B. SrinivasEmail:
  相似文献   

19.
This article presents the VLSI design of a configurable RSA public key cryptosystem supporting the 512-bit, 1024-bit and 2048-bit based on Montgomery algorithm achieving comparable clock cycles of current relevant works but with smaller die size. We use binary method for the modular exponentiation and adopt Montgomery algorithm for the modular multiplication to simplify computational complexity, which, together with the systolic array concept for electric circuit designs effectively, lower the die size. The main architecture of the chip consists of four functional blocks, namely input/output modules, registers module, arithmetic module and control module. We applied the concept of systolic array to design the RSA encryption/decryption chip by using VHDL hardware language and verified using the TSMC/CIC 0.35 m 1P4 M technology. The die area of the 2048-bit RSA chip without the DFT is 3.9 × 3.9 mm2 (4.58 × 4.58 mm2 with DFT). Its average baud rate can reach 10.84 kbps under a 100 MHz clock.  相似文献   

20.
RSA密码协处理器的实现   总被引:11,自引:0,他引:11  
李树国  周润德  冯建华  孙义和 《电子学报》2001,29(11):1441-1444
密码协处理器的面积过大和速度较慢制约了公钥密码体制RSA在智能卡中的应用.文中对Montgomery模乘算法进行了分析和改进,提出了一种新的适合于智能卡应用的高基模乘器结构.由于密码协处理器采用两个32位乘法器的并行流水结构,这与心动阵列结构相比它有效地降低了芯片的面积和模乘的时钟数,从而可在智能卡中实现RSA的数字签名与认证.实验表明:在基于0.35μm TSMC标准单元库工艺下,密码协处理器执行一次1024位模乘需1216个时钟周期,芯片设计面积为38k门.在5MHz的时钟频率下,加密1024位的明文平均仅需374ms.该设计与同类设计相比具有最小的模乘运算时钟周期数,并使芯片的面积降低了1/3.这个指标优于当今电子商务的密码协处理器,适合于智能卡应用.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号