
1.

An improved Montgomery's algorithm for high-speed RSA public-key cryptosystem

Chih-Yuang Su Shih-Am Hwang Po-Song Chen Cheng-Wen Wu 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(2):280-284

We revise Montgomery's algorithm such that modular multiplication can be executed twice as fast. Each iteration in our algorithm requires only one addition, while that in Montgomery's requires two additions. We then propose a cellular array to implement modular exponentiation for the Rivest-Shamir-Adleman cryptosystem. It has approximately 2n cells, where n is the word length. Each cell contains one full-adder and some controlling logic. The time to calculate a modular exponentiation is about 2n² clock cycles. The proposed architecture has a data rate of 100 kb/s for 512-b words and a 100 MHz clock.
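For orientation, the baseline that this abstract compares against can be sketched as follows. This is the textbook bit-serial (radix-2) Montgomery multiplication with its two conditional additions per iteration, not the authors' one-addition variant, which the abstract does not spell out. The function name `mont_mul_radix2` and the parameter `k` (number of iterations, with n < 2^k) are illustrative assumptions.

```python
def mont_mul_radix2(a: int, b: int, n: int, k: int) -> int:
    """Return a * b * 2^(-k) mod n for an odd modulus n < 2^k and 0 <= a, b < n.

    Textbook radix-2 Montgomery multiplication (hypothetical helper name).
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    for i in range(k):
        if (a >> i) & 1:   # first addition: add b when bit i of a is set
            s += b
        if s & 1:          # second addition: add n so that s becomes even
            s += n
        s >>= 1            # exact division by 2
    if s >= n:             # s < 2n holds here, so one subtraction suffices
        s -= n
    return s
```

The two `+=` lines inside the loop are the two additions per iteration that the abstract refers to; the result carries the usual extra factor 2^(-k), which is removed by working in Montgomery form.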

2.

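Implementation of an RSA cryptographic coprocessor (total citations: 11, self-citations: 0, citations by others: 11)

李树国 周润德 冯建华 孙义和 《电子学报》2001,29(11):1441-1444

The large area and low speed of cryptographic coprocessors limit the use of the RSA public-key cryptosystem in smart cards. This paper analyzes and improves the Montgomery modular multiplication algorithm and proposes a new high-radix modular multiplier architecture suited to smart-card applications. Because the coprocessor uses a parallel pipelined structure built from two 32-bit multipliers, it reduces both the chip area and the number of clock cycles per modular multiplication compared with systolic-array architectures, which makes RSA digital signature and authentication feasible in smart cards. Experiments show that, with a 0.35 μm TSMC standard-cell library, the coprocessor needs 1216 clock cycles for one 1024-bit modular multiplication and the design area is 38k gates. At a clock frequency of 5 MHz, encrypting a 1024-bit plaintext takes only 374 ms on average. Compared with similar designs, this design has the smallest number of clock cycles per modular multiplication and reduces the chip area by one third. These figures are better than those of current e-commerce cryptographic coprocessors and suit smart-card applications.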
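The 374 ms figure can be cross-checked from the other numbers in the abstract. The sketch below assumes a plain square-and-multiply exponentiation with a 1024-bit exponent that costs about 1.5 modular multiplications per exponent bit on average; that factor and the function name `rsa_time_ms` are assumptions, not stated in the abstract.

```python
def rsa_time_ms(bits: int, cycles_per_modmul: int, clock_hz: float,
                modmuls_per_bit: float = 1.5) -> float:
    """Estimated time in ms for one modular exponentiation.

    modmuls_per_bit = 1.5 is an assumed average for square-and-multiply:
    one squaring per bit plus one multiplication for every second bit.
    """
    total_cycles = bits * modmuls_per_bit * cycles_per_modmul
    return total_cycles / clock_hz * 1e3


# 1024-bit operands, 1216 cycles per modular multiplication, 5 MHz clock
print(round(rsa_time_ms(1024, 1216, 5e6)))  # prints 374
```

Under that assumption the estimate is 1536 multiplications × 1216 cycles = 1,867,776 cycles, or about 373.6 ms at 5 MHz, consistent with the reported average.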

3.

VLSI design of an RSA encryption/decryption chip using systolic array based architecture

Chi-Chia Sun Bor-Shing Lin Gene Eu Jan Jheng-Yi Lin 《International Journal of Electronics》2016,103(9):1538-1549

This article presents the VLSI design of a configurable RSA public-key cryptosystem supporting 512-bit, 1024-bit and 2048-bit operation. It is based on the Montgomery algorithm and achieves clock-cycle counts comparable to current relevant works, but with a smaller die size. We use the binary method for the modular exponentiation and adopt the Montgomery algorithm for the modular multiplication to simplify computational complexity, which, together with the systolic array concept for circuit design, effectively lowers the die size. The main architecture of the chip consists of four functional blocks, namely the input/output modules, registers module, arithmetic module and control module. We applied the concept of the systolic array to design the RSA encryption/decryption chip in the VHDL hardware description language and verified it using the TSMC/CIC 0.35 μm 1P4M technology. The die area of the 2048-bit RSA chip without DFT is 3.9 × 3.9 mm² (4.58 × 4.58 mm² with DFT). Its average baud rate can reach 10.84 kbps under a 100 MHz clock.

4.

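A hardware implementation design for large-number modular exponentiation

王晓林 周玉洁 《信息技术》2005,29(10):41-44

A hardware design method for large-number modular exponentiation is proposed. The modular multiplication part is based on an improved radix-2 Montgomery algorithm and uses a systolic-array modular multiplier. A new structure with dual-edge-triggered serial computation is proposed, which saves area while still reaching a high clock frequency. The modular exponentiation part is based on the M-ary algorithm, which reduces the number of modular multiplications required. The speed of this implementation is also compared with that of the common L-R (left-to-right) binary exponentiation algorithm.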
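The saving that an M-ary method offers over left-to-right binary exponentiation can be illustrated in software. The sketch below is a generic fixed-window (2^w-ary) exponentiation, not the hardware design from the paper; the function names, the window width `w = 4`, and the naive operation counting (which includes the initial squarings of 1) are assumptions made for illustration.

```python
def pow_binary_lr(g: int, e: int, n: int):
    """Left-to-right binary exponentiation.

    Returns (g^e mod n, number of modular multiplications performed).
    """
    result, count = 1, 0
    for bit in bin(e)[2:]:
        result = result * result % n      # one squaring per exponent bit
        count += 1
        if bit == "1":
            result = result * g % n       # one multiplication per 1 bit
            count += 1
    return result, count


def pow_m_ary(g: int, e: int, n: int, w: int = 4):
    """Fixed-window (2^w-ary) exponentiation.

    Returns (g^e mod n, number of modular multiplications performed).
    """
    count = 0
    table = [1, g % n]                    # table[d] = g^d mod n
    for _ in range(2, 1 << w):
        table.append(table[-1] * g % n)
        count += 1
    digits = []                           # w-bit digits of e, least significant first
    while e:
        digits.append(e & ((1 << w) - 1))
        e >>= w
    result = 1
    for d in reversed(digits):
        for _ in range(w):                # w squarings per digit
            result = result * result % n
            count += 1
        if d:                             # at most one multiplication per digit
            result = result * table[d] % n
            count += 1
    return result, count
```

For long exponents the windowed version trades a small precomputed table for roughly one multiplication per w bits instead of one per set bit, which is where the reduction in modular multiplications comes from.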

5.

Two systolic architectures for modular multiplication

Wei-Chang Tsai Shung C.B. Sheng-Jyh Wang 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(1):103-107

The authors present two systolic architectures to speed up the computation of modular multiplication in RSA cryptosystems. In the double-layer architecture, the main operation of Montgomery's algorithm is partitioned into two parallel operations after using the precomputation of the quotient bit. In the non-interlaced architecture, we eliminate the one-clock-cycle gap between iterations by pairing off the double-layer architecture. We compare our architectures with some previously proposed Montgomery-based systolic architectures, on the basis of both modular multiplication and modular exponentiation. The comparisons indicate that our architectures offer the highest speed, lower hardware complexity, and lower power consumption.

6.

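An SPA attack algorithm based on a security vulnerability of the Montgomery algorithm

甘刚 王敏 杜之波 吴震 《通信学报》2013,34(Z1):20-161

Most public-key cryptosystem algorithms are based on exponentiation over finite fields or on discrete logarithm operations, and these operations generally use the Montgomery algorithm to reduce computational complexity. Addressing the information leakage in the Montgomery algorithm itself that side-channel attacks can exploit, this paper analyzes the security vulnerability of the Montgomery algorithm from both theory and measured power-consumption data. Based on this vulnerability, it proposes a simple power analysis (SPA) attack algorithm against modular exponentiation implemented with the Montgomery algorithm. The algorithm is used to carry out a power analysis attack on the power traces of an actual modular exponentiation. Experiments show that the attack algorithm is effective.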

7.

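Research and implementation of the Montgomery algorithm

张宪 王喜成 《现代电子技术》2004,27(8):85-86

The Montgomery algorithm is analyzed, and the procedure for implementing modular exponentiation with an improved, precomputation-based Montgomery algorithm is described. Two algorithms for implementing modular multiplication and modular exponentiation are analyzed and compared. Simulations were carried out in C++ and ModelSim, and the simulation test results are reported.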

8.

A design of a fast pipelined modular multiplier based on a diminished-radix algorithm

Glenn Orton Lloyd Peppard Stafford Tavares 《Journal of Cryptology》1993,6(4):183-208

We present a new serial-parallel concurrent modular-multiplication algorithm and architecture suitable for standard RSA encryption. In the new scheme, multiplication is performed modulo a multiple of the RSA modulus n, which has a diminished-radix form 2^k − v, where k and v are positive integers and v < n. This design is the first concurrent modular multiplier to use a diminished-radix algorithm and to pipeline concurrent modular-reduction to optimize the clock rate. For a modular multiplier of order ranging from 1 to 10 (number of multiplier bits per clock cycle), a faster clock rate and throughput is possible than with other known designs including those of Brickell, Morita, Sedlak and Golze, and Miyaguchi. Throughput estimates for 512-bit RSA decryption range from 100 kbit/s in a serial mode to 650 kbit/s with a modular multiplier of order 10, at a clock rate of 20 MHz on 1.5 μm CMOS.
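The reason a diminished-radix modulus is convenient can be shown with a short software sketch. Since 2^k ≡ v (mod 2^k − v), the part of a number above bit k can be folded back by multiplying it by v and adding it to the low part. The code below only illustrates that folding identity; it is not the serial-parallel pipelined algorithm of the paper, and the function name `reduce_diminished_radix` is hypothetical.

```python
def reduce_diminished_radix(x: int, k: int, v: int) -> int:
    """Return x mod (2^k - v) for x >= 0 and 0 < v < 2^k.

    Uses the identity 2^k = v (mod 2^k - v) to fold the high part of x down.
    """
    if not 0 < v < (1 << k):
        raise ValueError("need 0 < v < 2^k")
    m = (1 << k) - v
    mask = (1 << k) - 1
    while x >> k:                        # while x has bits at position k or above
        x = (x >> k) * v + (x & mask)    # hi * 2^k + lo  ->  hi * v + lo
    while x >= m:                        # final correction by subtraction
        x -= m
    return x
```

When v is small relative to 2^k, each fold shrinks the operand quickly and the reduction needs only a multiplication by a short constant and an addition, with no trial division.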

9.

A Fully-Pipeline Linear Systolic Architecture for Modular Multiplier in Public-Key Crypto-Systems

Xingjun Wu Hongyi Chen Yihe Sun Weixin Gai 《The Journal of VLSI Signal Processing》2003,33(1-2):191-197

In this paper, a fully-pipeline linear systolic array based on an adjusted Montgomery's algorithm is presented to perform modular multiplication at extremely high speed. The processing element (PE) consists of only 4 full-adders and 14 flip-flops. The three-stage internally pipelined PE results in a very short critical path with only a one-bit full-adder delay. Thus, it can run at a very high cycle rate. The total execution time for an n-bit modular multiplication is 2n + 11 cycles with only (n/2 + 2) PEs. A modular exponentiation based on it takes (3n + 16.5)n cycles on average. Compared with most published VLSI modular multipliers, the hardware complexity is greatly reduced while keeping very high throughput. Therefore it is a good candidate for the arithmetic units used in many public-key crypto-systems, e.g. RSA, Elliptic Curve and so on, especially for embedded applications concerning information security.
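The two cycle counts in this abstract are consistent with each other under a common assumption, namely that a binary exponentiation with an n-bit exponent needs about 1.5n modular multiplications on average (n squarings plus n/2 multiplications). That assumption is not stated in the abstract; with it, the exponentiation cost follows directly from the 2n + 11 cycles per multiplication:

```latex
\[
  \underbrace{1.5\,n}_{\text{avg. modular multiplications}}
  \times
  \underbrace{(2n + 11)}_{\text{cycles per multiplication}}
  = 3n^{2} + 16.5\,n
  = (3n + 16.5)\,n .
\]
```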

10.

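A hardware algorithm for Montgomery modular exponentiation based on CSA adders

桂宇光 李林森 《信息技术》2005,29(11):24-27

An improved Montgomery modular multiplication and modular exponentiation algorithm is proposed. It uses a 5-to-2 CSA (carry-save adder) to perform the very long additions in the Montgomery modular multiplication algorithm. Other modular multiplication algorithms that currently use CSA adders need a CPA (carry-propagate adder) to process the CSA output when the modular multiplication result is produced. The algorithm proposed here allows both the input and output operands of the modular multiplication to stay in carry-save form, which avoids the time-consuming CPA addition of very long operands. This significantly reduces the number of clock cycles needed for a modular multiplication, improves the time efficiency of data processing, and speeds up RSA modular exponentiation.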
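The idea of keeping operands in carry-save form can be sketched with the simplest building block, a 3-to-2 carry-save adder. The paper uses a 5-to-2 CSA, so the code below is a simplified illustration of the principle rather than the proposed circuit; the function names are hypothetical and Python integers stand in for long bit vectors.

```python
def csa_3_to_2(x: int, y: int, z: int):
    """3-to-2 carry-save adder: compress three operands into (sum, carry).

    Every bit position is handled independently, so no carry ripples
    across the word; x + y + z == sum + carry always holds.
    """
    s = x ^ y ^ z                                # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1       # carries, shifted to their weight
    return s, c


def add_many_carry_save(operands):
    """Accumulate operands in carry-save form, with one final ordinary add."""
    s, c = 0, 0
    for op in operands:
        s, c = csa_3_to_2(s, c, op)              # result stays as a (sum, carry) pair
    return s + c                                 # single carry-propagate addition
```

In hardware, the final `s + c` is the long carry-propagate addition; an algorithm that accepts and returns (sum, carry) pairs can postpone it until the very end of the exponentiation.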

11.

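Design of a modular multiplier for the ECC cryptosystem

毛天然 李树国 《微电子学》2006,36(3):344-346,351

A modular multiplier based on the Montgomery algorithm is proposed. Compared with existing architectures, its multi-stage pipelined multiplier structure raises the system clock frequency. Introducing a precomputation unit resolves the pipeline stall problem, increases the parallelism of the system, and reduces the number of clock cycles required. The modular multiplier is 233 bits wide. Synthesis results for the SMIC 0.18 μm worst-case process show a maximum critical-path delay of 3.8 ns and a chip area of 2 mm². One modular multiplication needs only 108 clock cycles, which meets the requirements of ECC cryptosystem applications.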

12.

A scalable hybrid modular multiplication algorithm

Meng Qiang Chen Tao Dai Zibin Chen Quji 《电子科学学刊(英文版)》2008,25(3):378-383

Based on the analysis of several familiar large integer modular multiplication algorithms, this paper proposes a new Scalable Hybrid modular multiplication (SHyb) algorithm which has scalable operands, and presents an RSA algorithm model with scalable key size. Theoretical analysis shows that the SHyb algorithm requires m^2n/2 + 2m iterations to complete an mn-bit modular multiplication with the application of an n-bit modular addition hardware circuit. The number of required iterations can be reduced to half of that of the scalable Montgomery algorithm. Consequently, the application scope of the RSA cryptosystem is expanded and its operation speed is enhanced based on the SHyb algorithm.

13.

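A new variable-radix fast multiplication hardware algorithm for public-key cryptosystems

盖伟新 《电子学报》1995,23(11):77-80

This paper proposes a new variable-radix fast multiplication hardware algorithm. The algorithm uses a redundant representation of binary numbers, so that two large numbers (512 bits or more) can be added in O(1) time without waiting for carries. It also introduces the idea of variable-radix fast multiplication, which makes the algorithm 33% faster than radix-4 multiplication and 11% faster than radix-8 multiplication while being simpler to implement in hardware. The algorithm also overcomes the severe slowdown of radix-8 multiplication under bad and worst-case conditions. It is a new fast algorithm that can serve as the core operation in hardware VLSI implementations of many public-key cryptosystems (such as RSA).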

14.

A Systolic, High Speed Architecture for an RSA Cryptosystem

K.Z. Pekmestzi N.K. Moshopoulos 《The Journal of VLSI Signal Processing》2002,32(3):223-235

An architecture based on the RSA public key cryptography algorithm is presented. The circuit includes two components, one for modular squaring and one for modular multiplication. Each component is based on the Montgomery algorithm and implements the modular operations using two modified serial-parallel multipliers. A full modular exponentiation is completed every n(n + 3) clock cycles. All circuits are systolic, operate with 100% efficiency and their maximum combinational delay is equal to one gated Full-Adder. Thus, high-speed performance is achieved while the low cell hardware complexity enables an efficient VLSI implementation.

15.

FPGA implementation of RSA public-key cryptographic coprocessor based on systolic linear array architecture

Wen Nuan Dai Zibin Zhang Yongfu 《电子科学学刊(英文版)》2006,23(5):718-722

In order to make the typical Montgomery's algorithm suitable for implementation on FPGA, a modified version is proposed, and a high-performance systolic linear array architecture is then designed for the RSA cryptosystem on the basis of the optimized algorithm. The proposed systolic array architecture has distinctive features, i.e., the computation speed is significantly higher and the hardware overhead is drastically reduced. As a major practical result, the paper shows that it is possible to implement a public-key cryptosystem at secure bit lengths on a single commercially available FPGA.

16.

Three hardware architectures for the binary modular exponentiation: sequential, parallel, and systolic

Nedjah N. Mourelle Ld.M. 《IEEE transactions on circuits and systems. I, Regular papers》2006,53(3):627-633

Modular exponentiation is the cornerstone computation in public-key cryptography systems such as RSA cryptosystems. The operation is time consuming for large operands. This paper describes the characteristics of three architectures designed to implement modular exponentiation using the fast binary method: the first field-programmable gate array (FPGA) prototype has a sequential architecture, the second has a parallel architecture, and the third has a systolic array-based architecture. The paper compares the three prototypes as well as Blum and Paar's implementation using the classic time × area factor. All three prototypes implement the modular multiplication using the popular Montgomery algorithm.
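How the binary method and Montgomery multiplication fit together can be sketched in software. The code below is a generic illustration using word-level Montgomery reduction (REDC), not a model of any of the three hardware prototypes; the function name `mont_exp` and the parameter `k` (with R = 2^k > n) are assumptions, and `pow(n, -1, r)` requires Python 3.8 or later.

```python
def mont_exp(g: int, e: int, n: int, k: int) -> int:
    """Compute g^e mod n with the left-to-right binary method,
    using Montgomery multiplication with R = 2^k (n odd, n < 2^k)."""
    if n % 2 == 0 or n >= (1 << k):
        raise ValueError("need an odd modulus n < 2^k")
    r = 1 << k
    n_neg_inv = (-pow(n, -1, r)) % r        # -n^(-1) mod R

    def redc(t: int) -> int:
        """Montgomery reduction: t * R^(-1) mod n for 0 <= t < n * R."""
        m = (t * n_neg_inv) & (r - 1)
        u = (t + m * n) >> k                 # exact division by R
        return u - n if u >= n else u

    x = r % n                                # Montgomery form of 1
    g_bar = (g << k) % n                     # Montgomery form of g
    for bit in bin(e)[2:]:
        x = redc(x * x)                      # square for every exponent bit
        if bit == "1":
            x = redc(x * g_bar)              # multiply when the bit is 1
    return redc(x)                           # convert back from Montgomery form
```

All intermediate values stay in Montgomery form, so the only true divisions by n are the two conversions at the start; every step inside the loop uses shifts and masks instead.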

17.

Recovering lost efficiency of exponentiation algorithms on smartcards

《Electronics letters》2002,38(19):1095-1097

At the RSA cryptosystem implementation stage, a major security concern is resistance against so-called side-channel attacks. Solutions are known but they increase the overall complexity by a non-negligible factor (typically, a protected RSA exponentiation is 133% slower). For the first time, protected solutions are proposed that do not penalise the running time of an exponentiation.

18.

New and Improved Architectures for Montgomery Modular Multiplication

M. Sudhakar R. V. Kamala M. B. Srinivas 《Mobile Networks and Applications》2007,12(4):281-291

In this paper an improved Montgomery multiplier, based on modified four-to-two carry-save adders (CSAs) to reduce critical path delay, is presented. Instead of implementing four-to-two CSA using two levels of carry-save logic, authors propose a modified four-to-two CSA using only one level of carry-save logic taking advantage of pre-computed input values. Also, a new bit-sliced, unified and scalable Montgomery multiplier architecture, applicable for both RSA and ECC (Elliptic Curve Cryptography), is proposed. In the existing word-based scalable multiplier architectures, some processing elements (PEs) do not perform useful computation during the last pipeline cycle when the precision is not equal to an exact multiple of the word size, like in ECC. This intrinsic limitation requires a few extra clock cycles to operate on operand lengths which are not powers of 2. The proposed architecture eliminates the need for extra clock cycles by reconfiguring the design at bit-level and hence can operate on any operand length, limited only by memory and control constraints. It requires 2∼15% fewer clock cycles than the existing architectures for key lengths of interest in RSA and 11∼18% for binary fields and 10∼14% for prime fields in case of ECC. An FPGA implementation of the proposed architecture shows that it can perform 1,024-bit modular exponentiation in about 15 ms which is better than that by the existing multiplier architectures.


19.

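A strictly self-randomized modular exponentiation algorithm with side-channel atomicity

李志远 白雪飞 郭立 《微电子学与计算机》2010,27(2)

A defense against differential power analysis for the RSA cryptographic algorithm is studied. Based on an analysis of the self-randomized modular exponentiation algorithm, the BBS random number generator and the side-channel atomicity technique are applied to the improved algorithm, yielding a strictly self-randomized modular exponentiation algorithm with side-channel atomicity. Simulation results show that the method effectively defends against differential power analysis attacks.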

20.

A New Algorithm for High-Speed Modular Multiplication Design

Ming-Der Shieh Jun-Hong Chen Wen-Ching Lin Hao-Hsuan Wu 《IEEE transactions on circuits and systems. I, Regular papers》2009,56(9):2009-2019

Modular exponentiation in public-key cryptosystems is usually achieved by repeated modular multiplications on large integers. Designing high-speed modular multiplication is thus very crucial to speed up the decryption/encryption process. In this paper, we first explore how to relax the data dependency that exists between multiplication, quotient determination, and modular reduction in the conventional Montgomery modular multiplication algorithm. Then, we propose a new modular multiplication algorithm for high-speed hardware design. The speed improvement is achieved by reducing the critical path delay from the 4-to-2 to 3-to-2 carry-save addition. The resulting time complexity of our development is further decreased by simultaneously performing the multiplication and modular reduction processes. Experimental results show that the developed modular multiplication can operate at speeds higher than those of related work. When the proposed modular multiplication is applied to modular exponentiation, both time and area-time advantages are obtained.