共查询到20条相似文献,搜索用时 31 毫秒
1.
《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(9):1162-1175
2.
This paper focuses on the design and implementation of a fast reconfigurable method for elliptic curve cryptography acceleration
in GF(2
m
). The main contribution of this paper is comparing different reconfigurable modular multiplication methods and modular reduction
methods for software implementation on Intel IA-32 processors, optimizing point arithmetic to reduce the number of expensive
reduction operations through a novel reduction sharing technique, and measuring performance for scalar point multiplication
in GF(2
m
) on Intel IA-32 processors. This paper determined that systematic reduction is best for fields defined with trinomials or
pentanomials; however, for fields defined with reduction polynomials with large Hamming weight Barrett reduction is best.
In GF(2571) for Intel P4 2.8 GHz processor, long multiplication with systematic reduction was 2.18 and 2.26 times faster than long multiplication
with Barrett or Montgomery reduction. This paper determined that Montgomery Invariant scalar point multiplication with Systematic
reduction in Projective coordinates was the fastest method for single scalar point multiplication for the NIST fields from
GF(2163) to GF(2571). For single scalar point multiplication on a reconfigurable elliptic curve cryptography accelerator, we were able to achieve
∼6.1 times speedup using reconfigurable reduction methods with long multiplication, Montgomery’s MSB Invariant method in projective
coordinates, and systematic reduction. Further extensions were made to implement fast reconfigurable elliptic curve cryptography
for repeated scalar point multiplication on the same base point. We also show that for L > 20 the LSB invariant method combined with affine doubling precomputation outperforms the LSB invariant method combined
with López-Dahab doubling precomputation for all reconfigurable reduction polynomial techniques in GF(2571) for Intel IA-32 processors. For L = 1000, the LSB invariant scalar point multiplication method was 13.78 to 34.32% faster than using the fastest Montgomery
Invariant scalar point multiplication method on Intel IA-32 processors. 相似文献
3.
《IEEE transactions on circuits and systems. I, Regular papers》2006,53(9):1946-1957
A novel hardware architecture for elliptic curve cryptography (ECC) over$ GF(p)$ is introduced. This can perform the main prime field arithmetic functions needed in these cryptosystems including modular inversion and multiplication. This is based on a new unified modular inversion algorithm that offers considerable improvement over previous ECC techniques that use Fermat's Little Theorem for this operation. The processor described uses a full-word multiplier which requires much fewer clock cycles than previous methods, while still maintaining a competitive critical path delay. The benefits of the approach have been demonstrated by utilizing these techniques to create a field-programmable gate array (FPGA) design. This can perform a 256-bit prime field scalar point multiplication in 3.86 ms, the fastest FPGA time reported to date. The ECC architecture described can also perform four different types of modular inversion, making it suitable for use in many different ECC applications. 相似文献
4.
Cheung R.C.C. Telle N.J. Luk W. Cheung P.Y.K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(9):1048-1059
This paper presents a method for producing hardware designs for elliptic curve cryptography (ECC) systems over the finite field GF(2/sup m/), using the optimal normal basis for the representation of numbers. Our field multiplier design is based on a parallel architecture containing multiple m-bit serial multipliers; by changing the number of such serial multipliers, designers can obtain implementations with different tradeoffs in speed, size and level of security. A design generator has been developed which can automatically produce a customised ECC hardware design that meets user-defined requirements. To facilitate performance characterization, we have developed a parametric model for estimating the number of cycles for our generic ECC architecture. The resulting hardware implementations are among the fastest reported: for a key size of 270 bits, a point multiplication in a Xilinx XC2V6000 FPGA at 35 MHz can run over 1000 times faster than a software implementation on a Xeon computer at 2.6 GHz. 相似文献
5.
Lo’ai Ali Tawalbeh Abidalrahman Mohammad Adnan Abdul-Aziz Gutub 《Journal of Signal Processing Systems》2010,59(3):233-244
This paper presents a processor architecture for elliptic curve cryptography computations over GF(p). The speed to compute
the Elliptic-curve point multiplication over the prime fields GF(p) is increased by using the maximum degree of parallelism,
and by carefully selecting the most appropriate coordinates system. The proposed Elliptic Curve processor is implemented using
FPGAs. The time, area and throughput results are obtained, analyzed, and compared with previously proposed designs showing
interesting performance and features. 相似文献
6.
一种基于椭圆曲线的流水线实现方法 总被引:2,自引:2,他引:0
提出了一种基于椭圆曲线的流水线实现方法,来解决串行计算的效率低下问题.通过分析椭圆曲线密码运算的数据相关性,在不增加模乘器面积的前提下,采用三级流水线,提高了椭圆曲线密码的运算速度,并给出适用于椭圆曲线密码VLSI设计的流水线的实现流程. 相似文献
7.
标量乘及多标量乘算法是影响椭圆曲线密码系统性能的关键.基于二进制Edwards曲线提出并实现了一种新型的椭圆曲线标量乘法器.由于Edwards曲线的完备性,这种乘法器可对曲线上任意一点进行计算,而不用区分倍乘或者负元,实现较简单,有很高的运算速度和很强的抗侧信道攻击的能力. 相似文献
8.
RSA密码协处理器的实现 总被引:11,自引:0,他引:11
密码协处理器的面积过大和速度较慢制约了公钥密码体制RSA在智能卡中的应用.文中对Montgomery模乘算法进行了分析和改进,提出了一种新的适合于智能卡应用的高基模乘器结构.由于密码协处理器采用两个32位乘法器的并行流水结构,这与心动阵列结构相比它有效地降低了芯片的面积和模乘的时钟数,从而可在智能卡中实现RSA的数字签名与认证.实验表明:在基于0.35μm TSMC标准单元库工艺下,密码协处理器执行一次1024位模乘需1216个时钟周期,芯片设计面积为38k门.在5MHz的时钟频率下,加密1024位的明文平均仅需374ms.该设计与同类设计相比具有最小的模乘运算时钟周期数,并使芯片的面积降低了1/3.这个指标优于当今电子商务的密码协处理器,适合于智能卡应用. 相似文献
9.
Journal of Signal Processing Systems - Modular multiplication of long integers is a key component of elliptic curve cryptography and homomorphic encryption. The multiplication complexity can be... 相似文献
10.
针对签名验签速度难以满足特定应用领域需求的问题,该文设计了一种高性能Ed25519算法的硬件实现架构。采用宽度为2 bit的窗口法实现标量乘运算,减少了标量乘所需的总周期数;通过优化点加倍点操作步骤,提高了乘法器的硬件使用率;使用低计算复杂度的快速模约简实现模乘,提高了整体运算速度。为了使模L运算可复用标量乘中的快速模约简,该文提出一种基于Barrett约简的模L算法。通过优化解压过程中模幂操作过程,精简了步骤并使其可复用模乘。对所提架构做硬件实现,在TSMC的55 nm CMOS工艺下,面积为746×103等效门,最高频率360 MHz,每秒能够执行公钥生成9.06×104次、签名8.82×104次和验签3.99×104次。 相似文献
11.
基于双基表示的并列点乘算法 总被引:2,自引:1,他引:1
双基表示是一种整数表示法,它将任意整数表示成2和3的混合幂次的和或差的形式,并列点乘是一种快速的点乘算法,应用于一些椭圆曲线密码体制中.本文在现有的双基表示算法以及并列点乘算法的基础上,提出了新的双基表示算法以及基于该双基表示算法的并列点乘算法,该算法利用了一些特殊点的快速计算公式,从而有效地提高了并列点乘算法的执行效率.实验表明,在密钥长度为160比特,[S]/ =0.8时,当 /[M]=30,新算法的效率比基于JSF表示的并列点乘算法提高了22%;当 /[M]=10,新算法比JSF表示提高了6%;当 /[M]=8,新算法比JSF表示提高了3%. 相似文献
12.
基于椭圆曲线密码体制的多重数字签名算法主要体现在密码强度、加/解密的运算速度以及存储开销上有较大优势,能用较短的密钥实现较高的安全强度,提出的算法克服以往因运算比较复杂而导致数字签名验证速度不佳的情况,在算法中减少了原来广播多重数字签名方案中相对复杂的乘法及点乘运算,避免了求逆运算,提高了验证速度的同时,又不影响签名的安全性。 相似文献
13.
在椭圆曲线密码中,模逆运算是有限域运算中最复杂、最耗时且硬件实现难度最大的运算.本文在Kaliski算法的基础上,提出了基于有符号数字系统的Montgomery模逆算法,它支持素数域和二进制域上任意多精度参数的求模逆运算.据此算法,设计了相应的硬件结构方案,并给出了面积复杂度和时间复杂度分析.仿真结果表明,相比于其它模逆算法硬件设计方案,本文提出的基于有符号数字系统的Montgomery模逆算法在运算速度、电路面积、灵活性等方面具有显著的优越性. 相似文献
14.
15.
Makino H. Suzuki H. Morinaka H. Nakase Y. Mashiko K. Sumi T. 《Solid-State Circuits, IEEE Journal of》1996,31(4):504-513
This paper presents a high speed 64-b floating point (FP) multiplier that has a useful function for computer graphics (CG). The critical path delay is minimized by using high speed logic gates and limiting the stage number of series transmission gates (TGs). The high speed redundant binary architecture is applied to the multiplication of significands. This FP multiplier has a special function of “CG multiplication” that directly multiplies a pixel data by an FP data. This multiplier was fabricated by 0.5-μm CMOS technology with triple-level metal of interconnection. The active area size is 4.2×5.1 mm2. The operating cycle time is 3.5 ns at the supply voltage of 3.3 V, which corresponds to the frequency of 286 MHz. Implementation of CG multiplication increases the transistor count only 4%. Also, CG multiplication has no effect on the delay in the critical path 相似文献
16.
Low‐Power and Low‐Hardware Bit‐Parallel Polynomial Basis Systolic Multiplier over GF(2m) for Irreducible Polynomials
下载免费PDF全文
![点击此处可从《ETRI Journal》网站下载免费的PDF全文](/ch/ext_images/free.gif)
Multiplication in finite fields is used in many applications, especially in cryptography. It is a basic and the most computationally intensive operation from among all such operations. Several systolic multipliers are proposed in the literature that offer low hardware complexity or high speed. In this paper, a bit‐parallel polynomial basis systolic multiplier for generic irreducible polynomials is proposed based on a modified interleaved multiplication method. The hardware complexity and delay of the proposed multiplier are estimated, and a comparison with the corresponding multipliers available in the literature is presented. Of the corresponding multipliers, the proposed multiplier achieves a reduction in the hardware complexity of up to 20% when compared to the best multiplier for m = 163. The synthesis results of application‐specific integrated circuit and field‐programmable gate array implementations of the proposed multiplier are also presented. From the synthesis results, it is inferred that the proposed multiplier achieves low power consumption and low area complexitywhen compared to the best of the corresponding multipliers. 相似文献
17.
Benaissa M. Wei Ming Lim 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(6):659-662
The design of flexible elliptic curve cryptography processors (ECP) is considered in this paper. Novel word-level algorithms and implementations for the underlying GF(2/sup m/) multiplication and squaring arithmetic which enable improved flexibility versus performance tradeoffs, are presented and employed in the design of an efficient flexible ECP architecture; corresponding field-programmable gate-array (FPGA) prototyping results for two different processor word lengths are also included for evaluation. 相似文献
18.
19.
Parallelization of operations is of utmost importance for efficient implementation of Public Key Cryptography algorithms.
Starting with a classification of parallelization methods at different abstraction levels of public key algorithms, we propose
a novel memory architecture for elliptic curve implementations with multiple modular multiplier units. This architecture is
well-suited for different point addition and doubling algorithms over to be implemented on FPGAs. It allows the execution time to scale with the number of modular multipliers and exhibits nearly
no overhead compared to the mere runtime of the multipliers. The advantages of this distributed memory architecture are demonstrated
by means of two different point addition and doubling algorithms.
相似文献
Sorin A. HussEmail: |
20.
Koblitz has suggested to use “anomalous” elliptic curves defined over F2, which are non-supersingular and allow for efficient multiplication of a point by an integer. For these curves, Meier and Staffelbach gave a method to find a polynomial of the Frobenius map corresponding to a given multiplier. Muller generalized their method to arbitrary non-supersingular elliptic curves defined over a small field of characteristic 2. In this paper, we propose an algorithm to speed up scalar multiplication on an elliptic curve defined over a small field. The proposed algorithm uses the same technique as Muller's to get an expansion by the Frobenius map, but its expansion length is half of Muller's due to the reduction step (Algorithm 1). Also, it uses a more efficient algorithm (Algorithm 3) to perform multiplication using the Frobenius expansion. Consequently, the proposed algorithm is two times faster than Muller's. Moreover, it can be applied to an elliptic curve defined over a finite field with odd characteristic and does not require any precomputation or additional memory. 相似文献