期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

On Parallelization of High-Speed Processors for Elliptic Curve Cryptography

《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(9):1162-1175

This paper discusses parallelization of elliptic curve cryptography hardware accelerators using elliptic curves over binary fields $BBF_{2^{m}}$. Elliptic curve point multiplication, which is the operation used in every elliptic curve cryptosystem, is hierarchical in nature, and parallelism can be utilized in different hierarchy levels as shown in many publications. However, a comprehensive analysis on the effects of parallelization has not been previously presented. This paper provides tools for evaluating the use of parallelism and shows where it should be used in order to maximize efficiency. Special attention is given for a family of curves called Koblitz curves because they offer very efficient point multiplication. A new method where the latency of point multiplication is reduced with parallel field arithmetic processors is introduced. It is shown to outperform the previously presented multiple field multiplier techniques in the cases of Koblitz curves and generic curves with fixed base points. A highly efficient general elliptic curve cryptography processor architecture is presented and analyzed. Based on this architecture and analysis on the effects of parallelization, a few designs are implemented on an Altera Stratix II field-programmable gate array (FPGA). 相似文献

2.

Fast Reconfigurable Elliptic Curve Cryptography Acceleration for <Emphasis Type="Italic">GF</Emphasis>(2<Superscript><Emphasis Type="Italic">m</Emphasis></Superscript>) on 32 bit Processors

Aaron E. Cohen Keshab K. Parhi 《Journal of Signal Processing Systems》2010,60(1):31-45

This paper focuses on the design and implementation of a fast reconfigurable method for elliptic curve cryptography acceleration in GF(2^m). The main contribution of this paper is comparing different reconfigurable modular multiplication methods and modular reduction methods for software implementation on Intel IA-32 processors, optimizing point arithmetic to reduce the number of expensive reduction operations through a novel reduction sharing technique, and measuring performance for scalar point multiplication in GF(2^m) on Intel IA-32 processors. This paper determined that systematic reduction is best for fields defined with trinomials or pentanomials; however, for fields defined with reduction polynomials with large Hamming weight Barrett reduction is best. In GF(2⁵⁷¹) for Intel P4 2.8 GHz processor, long multiplication with systematic reduction was 2.18 and 2.26 times faster than long multiplication with Barrett or Montgomery reduction. This paper determined that Montgomery Invariant scalar point multiplication with Systematic reduction in Projective coordinates was the fastest method for single scalar point multiplication for the NIST fields from GF(2¹⁶³) to GF(2⁵⁷¹). For single scalar point multiplication on a reconfigurable elliptic curve cryptography accelerator, we were able to achieve ∼6.1 times speedup using reconfigurable reduction methods with long multiplication, Montgomery’s MSB Invariant method in projective coordinates, and systematic reduction. Further extensions were made to implement fast reconfigurable elliptic curve cryptography for repeated scalar point multiplication on the same base point. We also show that for L > 20 the LSB invariant method combined with affine doubling precomputation outperforms the LSB invariant method combined with López-Dahab doubling precomputation for all reconfigurable reduction polynomial techniques in GF(2⁵⁷¹) for Intel IA-32 processors. For L = 1000, the LSB invariant scalar point multiplication method was 13.78 to 34.32% faster than using the fastest Montgomery Invariant scalar point multiplication method on Intel IA-32 processors. 相似文献

3.

Hardware Elliptic Curve Cryptographic Processor Over$rm GF(p)$

《IEEE transactions on circuits and systems. I, Regular papers》2006,53(9):1946-1957

A novel hardware architecture for elliptic curve cryptography (ECC) over$ GF(p)$is introduced. This can perform the main prime field arithmetic functions needed in these cryptosystems including modular inversion and multiplication. This is based on a new unified modular inversion algorithm that offers considerable improvement over previous ECC techniques that use Fermat's Little Theorem for this operation. The processor described uses a full-word multiplier which requires much fewer clock cycles than previous methods, while still maintaining a competitive critical path delay. The benefits of the approach have been demonstrated by utilizing these techniques to create a field-programmable gate array (FPGA) design. This can perform a 256-bit prime field scalar point multiplication in 3.86 ms, the fastest FPGA time reported to date. The ECC architecture described can also perform four different types of modular inversion, making it suitable for use in many different ECC applications. 相似文献

4.

Customizable elliptic curve cryptosystems

Cheung R.C.C. Telle N.J. Luk W. Cheung P.Y.K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(9):1048-1059

This paper presents a method for producing hardware designs for elliptic curve cryptography (ECC) systems over the finite field GF(2/sup m/), using the optimal normal basis for the representation of numbers. Our field multiplier design is based on a parallel architecture containing multiple m-bit serial multipliers; by changing the number of such serial multipliers, designers can obtain implementations with different tradeoffs in speed, size and level of security. A design generator has been developed which can automatically produce a customised ECC hardware design that meets user-defined requirements. To facilitate performance characterization, we have developed a parametric model for estimating the number of cycles for our generic ECC architecture. The resulting hardware implementations are among the fastest reported: for a key size of 270 bits, a point multiplication in a Xilinx XC2V6000 FPGA at 35 MHz can run over 1000 times faster than a software implementation on a Xeon computer at 2.6 GHz. 相似文献

5.

Efficient FPGA Implementation of a Programmable Architecture for GF(p) Elliptic Curve Crypto Computations

Lo’ai Ali Tawalbeh Abidalrahman Mohammad Adnan Abdul-Aziz Gutub 《Journal of Signal Processing Systems》2010,59(3):233-244

This paper presents a processor architecture for elliptic curve cryptography computations over GF(p). The speed to compute the Elliptic-curve point multiplication over the prime fields GF(p) is increased by using the maximum degree of parallelism, and by carefully selecting the most appropriate coordinates system. The proposed Elliptic Curve processor is implemented using FPGAs. The time, area and throughput results are obtained, analyzed, and compared with previously proposed designs showing interesting performance and features. 相似文献

6.

一种基于椭圆曲线的流水线实现方法 总被引：2，自引：2，他引：0

张霄鹏李树国《微电子学与计算机》2008,25(7)

提出了一种基于椭圆曲线的流水线实现方法,来解决串行计算的效率低下问题.通过分析椭圆曲线密码运算的数据相关性,在不增加模乘器面积的前提下,采用三级流水线,提高了椭圆曲线密码的运算速度,并给出适用于椭圆曲线密码VLSI设计的流水线的实现流程. 相似文献

7.

基于二进制Edwards曲线的椭圆曲线加密多标量乘结构设计与实现

苏贺鹏张盛兵《微电子学与计算机》2011,28(2):98-102

标量乘及多标量乘算法是影响椭圆曲线密码系统性能的关键.基于二进制Edwards曲线提出并实现了一种新型的椭圆曲线标量乘法器.由于Edwards曲线的完备性,这种乘法器可对曲线上任意一点进行计算,而不用区分倍乘或者负元,实现较简单,有很高的运算速度和很强的抗侧信道攻击的能力. 相似文献

8.

RSA密码协处理器的实现 总被引：11，自引：0，他引：11

李树国周润德冯建华孙义和《电子学报》2001,29(11):1441-1444

密码协处理器的面积过大和速度较慢制约了公钥密码体制RSA在智能卡中的应用.文中对Montgomery模乘算法进行了分析和改进,提出了一种新的适合于智能卡应用的高基模乘器结构.由于密码协处理器采用两个32位乘法器的并行流水结构,这与心动阵列结构相比它有效地降低了芯片的面积和模乘的时钟数,从而可在智能卡中实现RSA的数字签名与认证.实验表明:在基于0.35μm TSMC标准单元库工艺下,密码协处理器执行一次1024位模乘需1216个时钟周期,芯片设计面积为38k门.在5MHz的时钟频率下,加密1024位的明文平均仅需374ms.该设计与同类设计相比具有最小的模乘运算时钟周期数,并使芯片的面积降低了1/3.这个指标优于当今电子商务的密码协处理器,适合于智能卡应用. 相似文献

9.

Efficient Hardware Implementation Architectures for Long Integer Modular Multiplication over General Solinas Prime

Huai Zheang Zhou Jingbo Zhang Xinmiao 《Journal of Signal Processing Systems》2022,94(10):1067-1082

Journal of Signal Processing Systems - Modular multiplication of long integers is a key component of elliptic curve cryptography and homomorphic encryption. The multiplication complexity can be... 相似文献

10.

高性能Ed25519算法硬件架构设计与实现

于斌黄海刘志伟赵石磊那宁《电子与信息学报》2021,43(7):1821-1827

针对签名验签速度难以满足特定应用领域需求的问题,该文设计了一种高性能Ed25519算法的硬件实现架构。采用宽度为2 bit的窗口法实现标量乘运算,减少了标量乘所需的总周期数;通过优化点加倍点操作步骤,提高了乘法器的硬件使用率;使用低计算复杂度的快速模约简实现模乘,提高了整体运算速度。为了使模L运算可复用标量乘中的快速模约简,该文提出一种基于Barrett约简的模L算法。通过优化解压过程中模幂操作过程,精简了步骤并使其可复用模乘。对所提架构做硬件实现,在TSMC的55 nm CMOS工艺下,面积为746×103等效门,最高频率360 MHz,每秒能够执行公钥生成9.06×104次、签名8.82×104次和验签3.99×104次。相似文献

11.

基于双基表示的并列点乘算法 总被引：2，自引：1，他引：1

鲍皖苏陈辉《电子学报》2009,37(4):873-876

双基表示是一种整数表示法,它将任意整数表示成2和3的混合幂次的和或差的形式,并列点乘是一种快速的点乘算法,应用于一些椭圆曲线密码体制中.本文在现有的双基表示算法以及并列点乘算法的基础上,提出了新的双基表示算法以及基于该双基表示算法的并列点乘算法,该算法利用了一些特殊点的快速计算公式,从而有效地提高了并列点乘算法的执行效率.实验表明,在密钥长度为160比特,[S]/ =0.8时,当 /[M]=30,新算法的效率比基于JSF表示的并列点乘算法提高了22%;当 /[M]=10,新算法比JSF表示提高了6%;当 /[M]=8,新算法比JSF表示提高了3%. 相似文献

12.

基于椭圆曲线密码体制的广播多重数字签名算法

张秋霞《黑龙江电子技术》2012,(9):114-117

基于椭圆曲线密码体制的多重数字签名算法主要体现在密码强度、加/解密的运算速度以及存储开销上有较大优势,能用较短的密钥实现较高的安全强度,提出的算法克服以往因运算比较复杂而导致数字签名验证速度不佳的情况,在算法中减少了原来广播多重数字签名方案中相对复杂的乘法及点乘运算,避免了求逆运算,提高了验证速度的同时,又不影响签名的安全性。相似文献

13.

基于有符号数字系统的Montgomery模逆算法及其硬件实现 总被引：1，自引：0，他引：1

下载免费PDF全文

陈琳孙万忠陈性元戴紫彬《电子学报》2012,40(3):489-494

在椭圆曲线密码中,模逆运算是有限域运算中最复杂、最耗时且硬件实现难度最大的运算.本文在Kaliski算法的基础上,提出了基于有符号数字系统的Montgomery模逆算法,它支持素数域和二进制域上任意多精度参数的求模逆运算.据此算法,设计了相应的硬件结构方案,并给出了面积复杂度和时间复杂度分析.仿真结果表明,相比于其它模逆算法硬件设计方案,本文提出的基于有符号数字系统的Montgomery模逆算法在运算速度、电路面积、灵活性等方面具有显著的优越性. 相似文献

14.

有效的模(2n+1)乘算法及其VLSI设计

熊承义田金文柳健《信号处理》2006,22(5):703-706

模乘运算在剩余数值系统、数字信号处理系统及其它领域都具有广泛的应用,模乘法器的硬件实现具有重要的作用。提出了一种改进的模(2~n 1)余数乘法器的算法及其硬件结构,其输入为通常的二进制表示,因此无需另外的输人数据转换电路而可直接用于数字信号处理应用。通过利用模(2~n 1)运算的周期性简化其乘积项并重组求和项,以及采用改进的进位存储加法器和超前进位加法器优化结构以减少路径延时和硬件复杂度。比较其它同类设计,新的结构具有较好的面积、延时性能。相似文献

15.

A 286 MHz 64-b floating point multiplier with enhanced CG operation

Makino H. Suzuki H. Morinaka H. Nakase Y. Mashiko K. Sumi T. 《Solid-State Circuits, IEEE Journal of》1996,31(4):504-513

This paper presents a high speed 64-b floating point (FP) multiplier that has a useful function for computer graphics (CG). The critical path delay is minimized by using high speed logic gates and limiting the stage number of series transmission gates (TGs). The high speed redundant binary architecture is applied to the multiplication of significands. This FP multiplier has a special function of “CG multiplication” that directly multiplies a pixel data by an FP data. This multiplier was fabricated by 0.5-μm CMOS technology with triple-level metal of interconnection. The active area size is 4.2×5.1 mm². The operating cycle time is 3.5 ns at the supply voltage of 3.3 V, which corresponds to the frequency of 286 MHz. Implementation of CG multiplication increases the transistor count only 4%. Also, CG multiplication has no effect on the delay in the critical path 相似文献

16.

Low‐Power and Low‐Hardware Bit‐Parallel Polynomial Basis Systolic Multiplier over GF(2m) for Irreducible Polynomials

下载免费PDF全文

Sudha Ellison Mathe Lakshmi Boppana 《ETRI Journal》2017,39(4):570-581

Multiplication in finite fields is used in many applications, especially in cryptography. It is a basic and the most computationally intensive operation from among all such operations. Several systolic multipliers are proposed in the literature that offer low hardware complexity or high speed. In this paper, a bit‐parallel polynomial basis systolic multiplier for generic irreducible polynomials is proposed based on a modified interleaved multiplication method. The hardware complexity and delay of the proposed multiplier are estimated, and a comparison with the corresponding multipliers available in the literature is presented. Of the corresponding multipliers, the proposed multiplier achieves a reduction in the hardware complexity of up to 20% when compared to the best multiplier for m = 163. The synthesis results of application‐specific integrated circuit and field‐programmable gate array implementations of the proposed multiplier are also presented. From the synthesis results, it is inferred that the proposed multiplier achieves low power consumption and low area complexitywhen compared to the best of the corresponding multipliers. 相似文献

17.

Design of flexible GF(2/sup m/) elliptic curve cryptography processors

Benaissa M. Wei Ming Lim 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(6):659-662

The design of flexible elliptic curve cryptography processors (ECP) is considered in this paper. Novel word-level algorithms and implementations for the underlying GF(2/sup m/) multiplication and squaring arithmetic which enable improved flexibility versus performance tradeoffs, are presented and employed in the design of an efficient flexible ECP architecture; corresponding field-programmable gate-array (FPGA) prototyping results for two different processor word lengths are also included for evaluation. 相似文献

18.

一种Montgomery型椭圆曲线的高效标量乘算法 总被引：1，自引：0，他引：1

下载免费PDF全文

庞世春刘淑芬从福仲姚志林《电子学报》2011,39(4):865-868

椭圆曲线标量乘法是椭圆曲线密码系统的基本运算,安全高效的标量乘法将直接提高椭圆曲线密码系统的效率和安全性.本文将Fibonacei数列的概念进行了扩展,提出了Fibonacci型数列的概念,并用Fibonaeei型数列将Montgomery型曲线上点的加法运算公式进行了简化,得到了新的点加公式fibAdd.利用黄金比率... 相似文献

19.

Parallel Memory Architecture for Elliptic Curve Cryptography over $$ \mathbb{G}\mathbb{F}{\left( p \right)} $$ Aimed at Efficient FPGA Implementation

Ralf Laue Sorin A. Huss 《Journal of Signal Processing Systems》2008,51(1):39-55

Parallelization of operations is of utmost importance for efficient implementation of Public Key Cryptography algorithms. Starting with a classification of parallelization methods at different abstraction levels of public key algorithms, we propose a novel memory architecture for elliptic curve implementations with multiple modular multiplier units. This architecture is well-suited for different point addition and doubling algorithms over to be implemented on FPGAs. It allows the execution time to scale with the number of modular multipliers and exhibits nearly no overhead compared to the mere runtime of the multipliers. The advantages of this distributed memory architecture are demonstrated by means of two different point addition and doubling algorithms.

Sorin A. HussEmail:

相似文献

20.

Scalar Multiplication on Elliptic Curves by Frobenius Expansions

Jung Hee Cheon Sangjoon Park Choonsik Park Sang Geun Hahn 《ETRI Journal》1999,21(1):28-39

Koblitz has suggested to use “anomalous” elliptic curves defined over F₂, which are non-supersingular and allow for efficient multiplication of a point by an integer. For these curves, Meier and Staffelbach gave a method to find a polynomial of the Frobenius map corresponding to a given multiplier. Muller generalized their method to arbitrary non-supersingular elliptic curves defined over a small field of characteristic 2. In this paper, we propose an algorithm to speed up scalar multiplication on an elliptic curve defined over a small field. The proposed algorithm uses the same technique as Muller's to get an expansion by the Frobenius map, but its expansion length is half of Muller's due to the reduction step (Algorithm 1). Also, it uses a more efficient algorithm (Algorithm 3) to perform multiplication using the Frobenius expansion. Consequently, the proposed algorithm is two times faster than Muller's. Moreover, it can be applied to an elliptic curve defined over a finite field with odd characteristic and does not require any precomputation or additional memory. 相似文献