1.
Benaissa M. Wei Ming Lim 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(6):659-662
The design of flexible elliptic curve cryptography processors (ECP) is considered in this paper. Novel word-level algorithms and implementations for the underlying GF(2^m) multiplication and squaring arithmetic, which enable improved flexibility versus performance tradeoffs, are presented and employed in the design of an efficient flexible ECP architecture; corresponding field-programmable gate-array (FPGA) prototyping results for two different processor word lengths are also included for evaluation.
2.
This paper focuses on the design and implementation of a fast reconfigurable method for elliptic curve cryptography acceleration in GF(2^m). The main contribution of this paper is comparing different reconfigurable modular multiplication methods and modular reduction methods for software implementation on Intel IA-32 processors, optimizing point arithmetic to reduce the number of expensive reduction operations through a novel reduction sharing technique, and measuring performance for scalar point multiplication in GF(2^m) on Intel IA-32 processors. This paper determined that systematic reduction is best for fields defined with trinomials or pentanomials; however, for fields defined with reduction polynomials with large Hamming weight, Barrett reduction is best. In GF(2^571) on an Intel P4 2.8 GHz processor, long multiplication with systematic reduction was 2.18 and 2.26 times faster than long multiplication with Barrett or Montgomery reduction. This paper determined that Montgomery Invariant scalar point multiplication with systematic reduction in projective coordinates was the fastest method for single scalar point multiplication for the NIST fields from GF(2^163) to GF(2^571). For single scalar point multiplication on a reconfigurable elliptic curve cryptography accelerator, we were able to achieve a ∼6.1 times speedup using reconfigurable reduction methods with long multiplication, Montgomery's MSB Invariant method in projective coordinates, and systematic reduction. Further extensions were made to implement fast reconfigurable elliptic curve cryptography for repeated scalar point multiplication on the same base point. We also show that for L > 20 the LSB invariant method combined with affine doubling precomputation outperforms the LSB invariant method combined with López-Dahab doubling precomputation for all reconfigurable reduction polynomial techniques in GF(2^571) on Intel IA-32 processors. For L = 1000, the LSB invariant scalar point multiplication method was 13.78 to 34.32% faster than the fastest Montgomery Invariant scalar point multiplication method on Intel IA-32 processors.
3.
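A pipelined implementation method based on elliptic curves  Total citations: 2 (self-citations: 2, citations by others: 0)
A pipelined implementation method for elliptic curve cryptography is proposed to address the low efficiency of serial computation. By analyzing the data dependencies in elliptic curve cryptographic operations, a three-stage pipeline is adopted that increases the speed of elliptic curve cryptography without increasing the area of the modular multiplier. An implementation flow for pipelines suited to the VLSI design of elliptic curve cryptography is also given.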
4.
In this article, a parallel hardware processor is presented to compute elliptic curve scalar multiplication in polynomial basis representation. The processor performs the operations of scalar multiplication using a modular arithmetic logic unit (MALU). The MALU consists of two multipliers, one adder, and one squarer. The two multiplications and the addition or squaring can be computed in parallel. A complete scalar multiplication over GF(2^163) can be performed in 3 064 cycles. Simulation results based on Xilinx Virtex2 XC2V6000 FPGAs show that the proposed design can compute random GF(2^163) elliptic curve scalar multiplication operations in 31.17 μs and occupies 3 994 registers and 15 527 LUTs, which indicates that the crypto-processor is suitable for high-performance applications.
5.
This paper presents the design and implementation of a hyperelliptic curve cryptography (HECC) coprocessor over affine and projective coordinates, along with measurements of its performance, hardware complexity, and power consumption. We applied several design techniques, including parallelism, pipelining, and loop unrolling, in designing field arithmetic units, group operation units, and scalar multiplication units to improve the performance and power consumption. Our affine and projective coordinate‐based HECC processors execute in 0.436 ms and 0.531 ms, respectively, based on the underlying field GF(2^89). These results are about five times faster than those for previous hardware implementations and at least 13 times better in terms of area‐time products. Further results suggest that neither case is superior to the other when considering the hardware complexity and performance. The characteristics of our proposed HECC coprocessor show that it is applicable to high‐speed network applications as well as resource‐constrained environments, such as PDAs, smart cards, and so on.
6.
List decoding of q-ary Reed-Muller codes  Total citations: 2 (self-citations: 0, citations by others: 2)
Pellikaan R. Xin-Wen Wu 《IEEE transactions on information theory / Professional Technical Group on Information Theory》2004,50(4):679-682
The q-ary Reed-Muller (RM) codes RM_q(u,m) of length n = q^m are a generalization of Reed-Solomon (RS) codes, which use polynomials in m variables to encode messages through functional encoding. Using an idea of reducing the multivariate case to the univariate case, randomized list-decoding algorithms for RM codes were given in and . The algorithm in Sudan et al. (1999) is an improvement of the algorithm in , it is applicable to codes RM_q(u,m) with u
7.
Optimized FPGA-based elliptic curve cryptography processor for high-speed applications  Total citations: 1 (self-citations: 0, citations by others: 1)
Kimmo Järvinen 《Integration, the VLSI Journal》2011,44(4):270-279
In this paper, we introduce an FPGA-based processor for elliptic curve cryptography on Koblitz curves. The processor specifically targets applications requiring very high speed. The processor is optimized for performing scalar multiplications, which are the basic operations of every elliptic curve cryptosystem, only on one specific Koblitz curve; support for other curves is achieved by reconfiguring the FPGA. We combine efficient methods from various recent papers into a very efficient processor architecture. The processor includes carefully designed processing units dedicated to different parts of the scalar multiplication in order to increase performance. The computation is pipelined, providing simultaneous processing of up to three scalar multiplications. We provide experimental results on an Altera Stratix II FPGA demonstrating that the processor computes a single scalar multiplication on average in and achieves a throughput of 235,550 scalar multiplications per second on NIST K-163.
8.
Codes from the Suzuki function field  Total citations: 1 (self-citations: 0, citations by others: 1)
Matthews G.L. 《IEEE transactions on information theory / Professional Technical Group on Information Theory》2004,50(12):3298-3302
We construct algebraic geometry (AG) codes from the function field F_{2^{2n+1}}(x,y)/F_{2^{2n+1}} defined by y^{2^{2n+1}} - y = x^{2^n}(x^{2^{2n+1}} - x), where n is a positive integer. These codes are supported by two places, and many have parameters that are better than those of any comparable code supported by one place of the same function field. To define such codes, we determine and exploit the structure of the Weierstrass gap set of an arbitrary pair of rational places of F_{2^{2n+1}}(x,y)/F_{2^{2n+1}}. Moreover, we find some codes over F_8 with parameters that are better than any known code.
9.
Lo’ai Ali Tawalbeh Abidalrahman Mohammad Adnan Abdul-Aziz Gutub 《Journal of Signal Processing Systems》2010,59(3):233-244
This paper presents a processor architecture for elliptic curve cryptography computations over GF(p). The speed of computing the elliptic-curve point multiplication over the prime fields GF(p) is increased by using the maximum degree of parallelism and by carefully selecting the most appropriate coordinate system. The proposed elliptic curve processor is implemented using FPGAs. The time, area, and throughput results are obtained, analyzed, and compared with previously proposed designs, showing interesting performance and features.
10.
Cheung R.C.C. Telle N.J. Luk W. Cheung P.Y.K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(9):1048-1059
This paper presents a method for producing hardware designs for elliptic curve cryptography (ECC) systems over the finite field GF(2^m), using the optimal normal basis for the representation of numbers. Our field multiplier design is based on a parallel architecture containing multiple m-bit serial multipliers; by changing the number of such serial multipliers, designers can obtain implementations with different tradeoffs in speed, size and level of security. A design generator has been developed which can automatically produce a customised ECC hardware design that meets user-defined requirements. To facilitate performance characterization, we have developed a parametric model for estimating the number of cycles for our generic ECC architecture. The resulting hardware implementations are among the fastest reported: for a key size of 270 bits, a point multiplication in a Xilinx XC2V6000 FPGA at 35 MHz can run over 1000 times faster than a software implementation on a Xeon computer at 2.6 GHz.
11.
Elixir: High-Throughput Cost-Effective Dual-Field Processors and the Design Framework for Elliptic Curve Cryptography  Total citations: 1 (self-citations: 0, citations by others: 1)
《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(11):1567-1580
12.
《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2008,16(9):1162-1175
13.
《Solid-State Circuits, IEEE Journal of》1985,20(3):730-740
This paper describes the architecture and design methodology used to produce a new custom IC intended for automatic document analysis. The circuit implements the entire operative part of a dedicated microprogrammed processor for the next generation of page readers which include items such as Optical Character Recognition (OCR) and different codings for graphics and images. The chip provides a wide range of powerful functions, performing up to three operations per cycle. It includes about 10 000 transistor sites and occupies an area of 20 mm². A standard 6-μm NMOS technology was used. Typical clock frequency is 2 MHz. The layout was obtained using a highly regular architecture and some automatically generated structures. New CAD tools provided an efficient and short design procedure.
14.
A change of representation for elements in F_{2^m} is proposed. The proposed representation is useful for architectures that implement unified Montgomery multiplication in the finite fields F_{2^m} and F_p used for elliptic curve cryptography, since it transforms a standard F_{2^m} multiplication into a Montgomery multiplication and comes at virtually no cost in terms of conversion operations.
15.
Eslami Y. Sheikholeslami A. Gulak P.G. Masui S. Mukaida K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(1):43-56
Cryptography circuits for smart cards and portable electronic devices provide user authentication and secure data communication. These circuits should, in general, occupy small chip area, consume low power, handle several cryptography algorithms, and provide acceptable performance. This paper presents, for the first time, a hardware implementation of three standard cryptography algorithms on a universal architecture. The microcoded cryptography processor targets smart card applications, implements both private key and public key algorithms, meets the power and performance specifications, and is as small as 2.25 mm² in 0.18-μm 6LM CMOS. A new algorithm is implemented by changing the contents of the memory blocks that are implemented in ferroelectric RAM (FeRAM). Using FeRAM allows nonvolatile storage of the configuration bits, which are changed only when a new algorithm instantiation is required.
16.
Hwa-Joon Oh Mueller S.M. Jacobi C. Tran K.D. Cottier S.R. Michael B.W. Nishikawa H. Totsuka Y. Namatame T. Yano N. Machida T. Dhong S.H. 《Solid-State Circuits, IEEE Journal of》2006,41(4):759-771
The floating-point unit (FPU) in the synergistic processor element (SPE) of a CELL processor is a fully pipelined 4-way single-instruction multiple-data (SIMD) unit designed to accelerate media and data streaming with 128-bit operands. It supports 32-bit single-precision floating-point and 16-bit integer operands with two different latencies, six-cycle and seven-cycle, with 11 FO4 delay per stage. The FPU optimizes the performance of critical single-precision multiply-add operations. Since exact rounding, exceptions, and de-norm number handling are not important to multimedia applications, IEEE correctness on the single-precision floating-point numbers is sacrificed for performance and simple design. It employs fine-grained clock gating for power saving. The design has 768K transistors in 1.3 mm², fabricated in 90-nm SOI technology. Correct operations have been observed up to 5.6 GHz with 1.4 V and 56°C, delivering 44.8 GFlops. Architecture, logic, circuits, and integration are codesigned to meet the performance, power, and area goals.
17.
Agnew G.B. Mullin R.C. Vanstone S.A. 《Selected Areas in Communications, IEEE Journal on》1993,11(5):804-813
The authors describe a VLSI Galois field processor and how it can be applied to the implementation of elliptic curve groups. They demonstrate the feasibility of constructing very fast, and very secure, public key systems with a relatively simple device, and the possibility of putting such a system on a smart card. The registers necessary to implement the elliptic curve system will require less than 1 mm² (or less than 4%) of the area available on the card.
18.
A 64-point Fourier transform chip for high-speed wireless LAN application using OFDM  Total citations: 1 (self-citations: 0, citations by others: 1)
In this paper, we present a novel fixed-point 16-bit word-width 64-point FFT/IFFT processor developed primarily for the application in an OFDM-based IEEE 802.11a wireless LAN baseband processor. The 64-point FFT is realized by decomposing it into a two-dimensional structure of 8-point FFTs. This approach reduces the number of required complex multiplications compared to the conventional radix-2 64-point FFT algorithm. The complex multiplication operations are realized using shift-and-add operations. Thus, the processor does not use a two-input digital multiplier. It also does not need any RAM or ROM for internal storage of coefficients. The proposed 64-point FFT/IFFT processor has been fabricated and tested successfully using our in-house 0.25-μm BiCMOS technology. The core area of this chip is 6.8 mm². The average dynamic power consumption is 41 mW at 20 MHz operating frequency and 1.8 V supply voltage. The processor completes one parallel-to-parallel (i.e., when all input data are available in parallel and all output data are generated in parallel) 64-point FFT computation in 23 cycles. These features show that though it has been developed primarily for application in the IEEE 802.11a standard, it can be used for any application that requires fast operation as well as low power consumption.
19.
《IEEE transactions on circuits and systems. I, Regular papers》2006,53(9):1946-1957
A novel hardware architecture for elliptic curve cryptography (ECC) over GF(p) is introduced. This can perform the main prime field arithmetic functions needed in these cryptosystems including modular inversion and multiplication. This is based on a new unified modular inversion algorithm that offers considerable improvement over previous ECC techniques that use Fermat's Little Theorem for this operation. The processor described uses a full-word multiplier which requires far fewer clock cycles than previous methods, while still maintaining a competitive critical path delay. The benefits of the approach have been demonstrated by utilizing these techniques to create a field-programmable gate array (FPGA) design. This can perform a 256-bit prime field scalar point multiplication in 3.86 ms, the fastest FPGA time reported to date. The ECC architecture described can also perform four different types of modular inversion, making it suitable for use in many different ECC applications.
20.
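Design of a high-speed dual-finite-field cryptographic coprocessor  Total citations: 10 (self-citations: 3, citations by others: 7)
This paper proposes a coprocessor that implements elliptic curve cryptography (ECC) at high speed over both of the finite fields GF(p) and GF(2^m). The coprocessor performs the basic operations of elliptic curve cryptography at high speed. By invoking these basic modular arithmetic instructions, various ECC-based encryption algorithms can be implemented. The coprocessor supports modular operations of any length up to 512 bits. It is fast: the design combines several acceleration structures and algorithms and uses a pipelined architecture. According to physical synthesis results, the coprocessor can operate at 300 MHz, and its computation time is roughly 4 to 10 times shorter than that of some earlier chips of the same kind.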
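Many of the entries above build on multiplication and squaring in GF(2^m) with reduction by a sparse polynomial. As a rough illustration only, and not the method of any listed paper, the sketch below shows bit-serial shift-and-add multiplication in polynomial basis followed by reduction modulo the NIST pentanomial for GF(2^163). The function names (gf2m_reduce, gf2m_mul, gf2m_sqr) and the choice of Python integers as bit vectors are assumptions made for this example.

```python
# Illustrative sketch, not taken from any paper in the list above.
# A field element is a Python int whose bit i is the coefficient of x^i.

M = 163
# NIST reduction pentanomial for GF(2^163): x^163 + x^7 + x^6 + x^3 + 1
F = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf2m_reduce(c, m=M, f=F):
    """Reduce the polynomial c modulo f, clearing one high bit per step."""
    for i in range(c.bit_length() - 1, m - 1, -1):
        if (c >> i) & 1:
            # XOR a shifted copy of f: removes x^i and folds it into lower terms.
            c ^= f << (i - m)
    return c


def gf2m_mul(a, b, m=M, f=F):
    """Carry-less shift-and-add product of a and b, then reduction modulo f."""
    c = 0
    while b:
        if b & 1:
            c ^= a  # addition in GF(2) is XOR
        a <<= 1
        b >>= 1
    return gf2m_reduce(c, m, f)


def gf2m_sqr(a, m=M, f=F):
    """Squaring expressed as a multiplication (hardware designs often specialise it)."""
    return gf2m_mul(a, a, m, f)
```

For example, gf2m_mul(1 << 162, 2) multiplies x^162 by x, and the reduction folds x^163 into x^7 + x^6 + x^3 + 1. Hardware and optimized software implementations such as those surveyed above typically process several bits or whole words per step rather than one bit at a time.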