期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Novel Radix Finite Field Multiplier for GF(2m)

Mekhallalati M.C. Ashur A.S. Ibrahim M.K. 《Journal of Signal Processing Systems》1997,15(3):233-245

In this paper, a new High-Radix Finite Field multiplication algorithm for GF(2^m) is proposed for the first time. The proposed multiplication algorithm can operate in a Digit-serial fashion, and hence can give a trade-off between the speed, the area , the input/output pin limitation, and the low power consumption by simply varying the digit size. A detailed example of a new Radix-16 GF(2^m) Digit-Serial multiplication architecture adopting the proposed algorithm illustrates a speed improvement of 75% when compared to conventional Radix-2 bit-serial realization. This is made more significant when it is noted that the speed improvement of 75% was achieved at the expense of only 2.3 times increase in the hardware requirements of the proposed architecture. 相似文献

2.

Low-complexity bit-parallel systolic multipliers over GF(2)

Chiou-Yng Lee^{Author Vitae} 《Integration, the VLSI Journal》2008,41(1):106-112

This paper presents new time-dependent and time-independent multiplication algorithms over finite fields GF(2^m) by employing an interleaved conventional multiplication and a folded technique. The proposed algorithm allows efficient realization of the bit-parallel systolic multipliers. The results show that the proposed time-independent multiplier saves about 54% space complexity as compared to other related multipliers for polynomial and dual bases of GF(2^m). The proposed architectures include the features of regularity, modularity and local interconnection. Accordingly, it is well suited for VLSI implementation. 相似文献

3.

Design and Implementation of a Polynomial Basis Multiplier Architecture Over GF(2m)

Huong Ho 《Journal of Signal Processing Systems》2014,75(3):203-208

In this paper, the design and circuit implementation of a polynomial basis multiplier architecture over Galois Fields GF(2^m) is presented. The proposed architecture supports field multiplication of two m-term polynomials where m is a positive integer. Circuit implementations based on this parameterized architecture where m is configurable is suitable for applications in error control coding and cryptography. The proposed architecture offers low latency, polynomial basis multiplication where the irreducible polynomial P(x)?=?x ^m?+?p _kt.?x ^kt?+?…?+?p ₁.?x?+?1 with m ≥ kt + 4 is dynamically reconfigurable. Results of the complexity analysis show that the proposed architecture requires less logic resources compared to existing sequential polynomial basis multipliers. In terms of timing performance, the proposed architecture has a latency of m/4, which is the lowest among the multipliers found in literature for GF(2^m). 相似文献

4.

Design and implementation of dual-field modular inversion algorithm

Yang Xiaohui Yang Tongjie Dai Zibin Zhang Yongfu 《电子科学学刊(英文版)》2010,27(4):531-537

Modular inverse arithmetic plays an important role in elliptic curve cryptography. Based on the analysis of Montgomery modular inversion algorithm, this paper presents a new dual-field modular inversion algorithm, and a novel scalable and unified architecture for Montgomery inverse hardware in finite fields GF(p) and GF(2ⁿ) is proposed. Furthermore, this architecture based on the new modular inversion algorithm has been verified by modeling it in Verilog-HDL, and accomplished it under 0.18 μm CMOS technology. The result indicates that our work has better performance and flexibility than other works. 相似文献

5.

High-Speed Recursion Architectures for MAP-Based Turbo Decoders

Wang Z. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(4):470-474

The maximum a posterior probability (MAP) algorithm has been widely used in Turbo decoding for its outstanding performance. However, it is very challenging to design high-speed MAP decoders because of inherent recursive computations. This paper presents two novel high-speed recursion architectures for MAP-based Turbo decoders. Algorithmic transformation, approximation, and architectural optimization are incorporated in the proposed designs to reduce the critical path. Simulations show that neither of the proposed designs has observable decoding performance loss compared to the true MAP algorithm when applied in Turbo decoding. Synthesis results show that the proposed Radix-2 recursion architecture can achieve comparable processing speed to that of the state-of-the-art recursion (Radix-4) architecture with significantly lower complexity while the proposed Radix-4 architecture is 32% faster than the best existing design 相似文献

6.

Low Power Semi-systolic Architectures for Polynomial-Basis Multiplication over GF(2m) Using Progressive Multiplier Reduction

Atef Ibrahim Fayez Gebali 《Journal of Signal Processing Systems》2016,82(3):331-343

We present low area and low power semi-systolic array architectures for polynomial basis multiplication over GF(2^m) using Progressive Multiplier Reduction Technique (PMR). These architectures are explored using linear and nonlinear techniques applied to the polynomial multiplication algorithm. The nonlinear techniques allow the designer, to control the processor workload and reduce the inter-processor communications. The semi-systolic architectures obtained have simple structure with local communication. ASIC implementations of our designs and comparable published designs show that the proposed scalable semi-systolic structures have less area complexity (56.8–94.6 %) and power consumption (55.2–84.2 %) except for a scalable design published by the same authors. However, one of the proposed scalable designs outperforms this design in terms of throughput by 73.8 %. This makes the proposed designs suited to embedded applications that require low power consumption and moderate speed. 相似文献

7.

High speed merged array multiplication

Farhad Fuad Islam Keikichi Tamaru 《The Journal of VLSI Signal Processing》1995,10(1):41-52

Multiplication-accumulation operations described by represent the fundamental computation involved in many digital signal processing algorithms. For high speed signal processing, one obvious approach to realize the above computation in VLSI is to employm discrete multipliers working in parallel. However, a more area efficient approach is offered by the merged multiplication technique [5]. But the principal drawback of the conventional merged technique is its longer latency than the former discrete approach. This work proposes a hardware algorithm for merged array multiplication which eliminates this drawback and achieves significant improvement in latency when compared with the conventional scheme for merged multiplication. The proposed algorithm utilizes multiple wave front computation as opposed to the traditional approach where computation in an array multiplier is carried out by a single wave front. The improvement in latency by the proposed approach is greater than 40% (form>2) when compared with a conventional approach to merged multiplication. The consequent cost in the form of additional requirement of VLSI area is found to be rather small. In this paper, we provide a thorough analytic discussion on the proposed algorithm and support it by experimental results. 相似文献

8.

High-Speed Array Multipliers Based on On-the-Fly Conversion

Sang-Man Moh Suk-Han Yoon 《ETRI Journal》1997,19(4):317-325

A new on-the-fly conversion algorithm is proposed, and high-speed array multipliers with the on-the-fly conversion are presented. The new on-the-fly conversion logic is used to speed up carry-propagate addition at the last stage of multiplication, and provides constant delay independent of the number of input bits. In this paper, the multiplication architecture and the on-the-fly conversion algorithm are presented and discussed in detail. The proposed architecture has multiplication time of (n + 1)t_FA, where n is the number of input bits and t_FA is the delay of a full adder. According to our comparative performance evaluation, the proposed architecture has shorter delay and requires less area than the conventional array multiplier with on-the-fly conversion. 相似文献

9.

用于全同态加密的数论变换乘法蝶形运算优化及实现

华斯亮张惠国王书昶《电子与信息学报》2021,43(5):1381-1388

全同态加密(FHE)可以真正从根本上解决云计算时将数据及其操作委托给第三方时的数据安全问题。针对全同态加密中占较大比例的大整数乘法运算优化需求,该文提出一种数论变换乘法蝶形运算的操作数合并算法,利用取模操作的快速算法,分别可将基16和基32运算单元的操作数减少到43.8%和39.1%。在此基础上,设计并实现了数论变换基32运算单元的硬件设计架构,在SMIC 90 nm工艺下的综合结果显示,电路的最高工作频率为600 MHz,面积1.714 mm2。实验结果表明,该优化算法提升了数论变换乘法蝶形运算的计算效率。相似文献

10.

Efficient Arithmetic in Finite Field Extensions with Application in Elliptic Curve Cryptography

Daniel V. Bailey Christof Paar 《Journal of Cryptology》2001,14(3):153-176

This contribution focuses on a class of Galois field used to achieve fast finite field arithmetic which we call an Optimal Extension Field (OEF), first introduced in [3]. We extend this work by presenting an adaptation of Itoh and Tsujii's algorithm for finite field inversion applied to OEFs. In particular, we use the facts that the action of the Frobenius map in GF (p ^m ) can be computed with only m-1 subfield multiplications and that inverses in GF (p) may be computed cheaply using known techniques. As a result, we show that one extension field inversion can be computed with a logarithmic number of extension field multiplications. In addition, we provide new extension field multiplication formulas which give a performance increase. Further, we provide an OEF construction algorithm together with tables of Type I and Type II OEFs along with statistics on the number of pseudo-Mersenne primes and OEFs. We apply this new work to provide implementation results using these methods to construct elliptic curve cryptosystems on both DEC Alpha workstations and Pentium-class PCs. These results show that OEFs when used with our new inversion and multiplication algorithms provide a substantial performance increase over other reported methods. Received 7 July 1999 and revised 29 March 2000 Online publication 15 September 2000 相似文献

11.

An implementation for a fast public-key cryptosystem 总被引：9，自引：0，他引：9

G. B. Agnew R. C. Mullin I. M. Onyszchuk S. A. Vanstone 《Journal of Cryptology》1991,3(2):63-79

In this paper we examine the development of a high-speed implementation of a system to perform exponentiation in fields of the form GF(2ⁿ). For sufficiently large n, this device has applications in public-key cryptography. The selection of representation and observations on the structure of multiplication have led to the development of an architecture which is of low complexity and high speed. A VLSI implementation has being fabricated with measured throughput for exponentiation for cryptographic purposes of approximately 300 kilobits per second. 相似文献

12.

用于全同态加密的数论变换乘法蝶形运算优化及实现

华斯亮张惠国王书昶《电子与信息学报》2022,43(5):1381-1388

全同态加密(FHE)可以真正从根本上解决云计算时将数据及其操作委托给第三方时的数据安全问题.针对全同态加密中占较大比例的大整数乘法运算优化需求,该文提出一种数论变换乘法蝶形运算的操作数合并算法,利用取模操作的快速算法,分别可将基16和基32运算单元的操作数减少到43.8%和39.1%.在此基础上,设计并实现了数论变换基32运算单元的硬件设计架构,在SMIC 90 nm工艺下的综合结果显示,电路的最高工作频率为600 MHz,面积1.714 mm2.实验结果表明,该优化算法提升了数论变换乘法蝶形运算的计算效率. 相似文献

13.

New Bit-Parallel Systolic Architectures for Computing Multiplication,Multiplicative Inversion and Division in GF(2<Superscript>m</Superscript>) Under Polynomial Basis and Normal Basis Representations

Chiou-Yng Lee Che Wun Chiou 《Journal of Signal Processing Systems》2008,52(3):313-324

A new bit-parallel systolic multiplier over GF(2^m) under the polynomial basis and normal basis is proposed. This new circuit is constructed by m ² identical cells, each of which consists of one two-input AND gate, one three-input XOR gate and five 1-bit latches. Especially, the proposed architecture is without the basis conversion as compared to the well-known multipliers with the redundant representation. With this proposed multiplier, a parallel-in parallel-out systolic array has also been developed for computing inversion and division over GF(2^m). The proposed architectures are well suited to VLSI systems due to their regular interconnection pattern and modular structure.

Che Wun ChiouEmail:

相似文献

14.

High‐throughput and low‐area implementation of orthogonal matching pursuit algorithm for compressive sensing reconstruction

Vu Quan Nguyen Woo Hyun Son Marek Parfieniuk Luong Tran Nhat Trung Sang Yoon Park 《ETRI Journal》2020,42(3):376-387

Massive computation of the reconstruction algorithm for compressive sensing (CS) has been a major concern for its real‐time application. In this paper, we propose a novel high‐speed architecture for the orthogonal matching pursuit (OMP) algorithm, which is the most frequently used to reconstruct compressively sensed signals. The proposed design offers a very high throughput and includes an innovative pipeline architecture and scheduling algorithm. Least‐squares problem solving, which requires a huge amount of computations in the OMP, is implemented by using systolic arrays with four new processing elements. In addition, a distributed‐arithmetic‐based circuit for matrix multiplication is proposed to counterbalance the area overhead caused by the multi‐stage pipelining. The results of logic synthesis show that the proposed design reconstructs signals nearly 19 times faster while occupying an only 1.06 times larger area than the existing designs for N = 256, M = 64, and m = 16, where N is the number of the original samples, M is the length of the measurement vector, and m is the sparsity level of the signal. 相似文献

15.

An approach for fixed coefficient RNS-based FIR filter

Kotha Srinivasa Reddy Subhendu Kumar Sahoo 《International Journal of Electronics》2013,100(8):1358-1376

相似文献

16.

Scalable Gaussian Normal Basis Multipliers over GF(2<Superscript><Emphasis Type="Italic">m</Emphasis></Superscript>) Using Hankel Matrix-Vector Representation

Chiou-Yng Lee Che Wun Chiou 《Journal of Signal Processing Systems》2012,69(2):197-211

This work presents a novel scalable multiplication algorithm for a type-t Gaussian normal basis (GNB) of GF(2^m). Utilizing the basic characteristics of MSD-first and LSD-first schemes with d-bit digit size, the GNB multiplication can be decomposed into n(n + 1) Hankel matrix-vector multiplications. where n = (mt + 1)/d. The proposed scalable architectures for computing GNB multiplication comprise of one d × d Hankel multiplier, four registers and one final reduction polynomial circuit. Using the relationship of the basis conversion from the GNB to the normal basis, we also present the modified scalable multiplier which requires only nk Hankel multiplications, where k = mt/2d if m is even or k = (mt − t + 2)/2d if m is odd. The developed scalable multipliers have the feature of scalability. It is shown that, as the selected digit size d ≥ 8, the proposed scalable architectures have significantly lower time-area complexity than existing digit-serial multipliers. Moreover, the proposed architectures have the features of regularity, modularity, and local interconnection ability. Accordingly, they are well suited for VLSI implementation. 相似文献

17.

An efficient tree architecture for modulo 2ⁿ+1 multiplication

Zhongde Wang G. A. Jullien W. C. Miller 《The Journal of VLSI Signal Processing》1996,14(3):241-248

Modulo 2ⁿ+1 multiplication plays an important role in the Fermat number transform and residue number systems; the diminished-1 representation of numbers has been found most suitable for representing the elements of the rings. Existing algorithms for modulo (2ⁿ+1) multiplication either use recursive modulo (2ⁿ+1) addition, or a regular binary multiplication integrated with the modulo reduction operation. Although most often adopted for largen, this latter approach requires conversions between the diminished-1 and binary representations. In this paper we propose a parallel fine-grained architecture, based on a Wallace tree, for modulo (2ⁿ+1) multiplication which does not require any conversions; the use of a Wallace tree considerably improves the speed of the multiplier. This new architecture exhibits an extremely modular structure with associated VLSI implementation advantages. The critical path delay and the hardware requirements of the new multiplier are similar to that of a correspondingn×n bit binary multiplier. 相似文献

18.

GF(2ⁿ)域上的一种Ⅱ型优化正规基乘法器及其FPGA实现 总被引：1，自引：0，他引：1

下载免费PDF全文

方冰樊海宁戴一奇《电子学报》2002,30(Z1):2045-2048

有限域GF(2ⁿ)上的椭圆曲线密码体制以其密钥短,安全强度高的优点正在获得广泛的重视和应用.该密码体制最主要的运算是有限域上的乘法运算.本文提出了一种基于Ⅱ型优化正规基的乘法器,该乘法器具有Massey-Omura乘法器的优点,又避免了其不足,易于编程,适合FPGA实现.实验表明,该算法简单,快速. 相似文献

19.

Concurrent Error Detection in a Bit-Parallel Systolic Multiplier for Dual Basis of GF(2<Superscript><Emphasis Type="Italic">m</Emphasis></Superscript>)

Chiou-Yng?Lee Email author Che?Wun?Chiou Jim-Min?Lin 《Journal of Electronic Testing》2005,21(5):539-549

The finite field is widely used in error-correcting codes and cryptography. Among its important arithmetic operations, multiplication is identified as the most important and complicated. Therefore, a multiplier with concurrent error detection ability is elegantly needed. In this paper, a concurrent error detection scheme is presented for bit-parallel systolic dual basis multiplier over GF(2^m) according to the Fenn’s multiplier in [7]. Although, the proposed method increases the space complexity overhead about 27% and the latency overhead about one extra clock cycle as compared to Fenn’s multiplier. Our analysis shows that all single stuck-at faults can be detected concurrently. 相似文献

20.

LFSR multipliers over GF(2) defined by all-one polynomial

Hyun-Sung Sung-Woon 《Integration, the VLSI Journal》2007,40(4):473-478

This paper presents two bit-serial modular multipliers based on the linear feedback shift register using an irreducible all one polynomial (AOP) over GF(2^m). First, a new multiplication algorithm and its architecture are proposed for the modular AB multiplication. Then a new algorithm and architecture for the modular AB² multiplication are derived based on the first multiplier. They have significantly smaller hardware complexity than the previous multipliers because of using the property of AOP. It simplifies the modular reduction compared with the case of using the generalized irreducible polynomial. Since the proposed multipliers have low hardware requirements and regular structures, they are suitable for VLSI implementation. The proposed multipliers can be used as the kernel architecture for the operations of exponentiation, inversion, and division. 相似文献