首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到10条相似文献,搜索用时 109 毫秒
1.
An architecture based on the RSA public key cryptography algorithm is presented. The circuit includes two components, one for modular squaring and one for modular multiplication. Each component is based on the Montgomery algorithm and implements the modular operations using two modified serial-parallel multipliers. A full modular exponentiation is completed every n(n + 3) clock cycles. All circuits are systolic, operate with 100% efficiency and their maximum combinational delay is equal to one gated Full-Adder. Thus, high-speed performance is achieved while the low cell hardware complexity enables an efficient VLSI implementation.  相似文献   

2.
This article is concerned with various arithmetic operations inGF(2 m ). In particular we discuss techniques for computing multiplicative inverses and doing exponentiation. The method used for exponentiation is highly suited to parallel computation. All methods achieve much of their efficiency from exploiting a normal basis representation in the field.  相似文献   

3.
Modulo 2 n +1 multiplication plays an important role in the Fermat number transform and residue number systems; the diminished-1 representation of numbers has been found most suitable for representing the elements of the rings. Existing algorithms for modulo (2 n +1) multiplication either use recursive modulo (2 n +1) addition, or a regular binary multiplication integrated with the modulo reduction operation. Although most often adopted for largen, this latter approach requires conversions between the diminished-1 and binary representations. In this paper we propose a parallel fine-grained architecture, based on a Wallace tree, for modulo (2 n +1) multiplication which does not require any conversions; the use of a Wallace tree considerably improves the speed of the multiplier. This new architecture exhibits an extremely modular structure with associated VLSI implementation advantages. The critical path delay and the hardware requirements of the new multiplier are similar to that of a correspondingn×n bit binary multiplier.  相似文献   

4.
In this paper, a fully-pipeline linear systolic array based on adjusted Montgomery's algorithm is presented to perform modular multiplication at extremely high speed. The processing element (PE) consists of only 4 full-adders and 14 flip-flops. Three-stage internal pipelined PE results in a very short critical path with only a one-bit full-adder delay. Thus, it can run at a very high cycle rate. The total execution time for an n-bit modular multiplication is 2n + 11 cycles with only (n/2 + 2) PEs. A modular exponentiation based on it takes (3n + 16.5)n cycles in average. Compared with most published VLSI modular multipliers, the hardware complexity is greatly reduced while keeping very high throughput. Therefore it is a good candidate of the arithmetic units used in the many public-key crypto-systems, e.g. RSA, Elliptic Curve and so on, especially for the embedded applications concerning information security.  相似文献   

5.
Signal Processing algorithms generally rely heavily on the convolution operation which in turn is multiplication intensive. However, more recently convolution algorithms based on the squaring operation as opposed to the multiplication operation have been developed. In this article we present two ROM based methods for performing the squaring operation modulo 2 n , modulo 2 n −1, or modulo 2 n +1. The performance, cost, and implementation issues of the two methods are analyzed in detail and compared against each other as well as with a traditional ROM based implementation. It is shown that both methods obtain ROM bit savings of 99.99%, for 32-bit word lengths, when compared with traditional techniques. However, one of the methods outperforms the other in all other respects such as overhead costs, of up to 99.48% savings, performance, up to about 20 times faster, and regularity and simplicity of hardware design.  相似文献   

6.
In this paper, we describe a fixed‐threshold sequential minimal optimization (FSMO) for structured SVM problems. FSMO is conceptually simple, easy to implement, and faster than the standard support vector machine (SVM) training algorithms for structured SVM problems. Because FSMO uses the fact that the formulation of structured SVM has no bias (that is, the threshold b is fixed at zero), FSMO breaks down the quadratic programming (QP) problems of structured SVM into a series of smallest QP problems, each involving only one variable. By involving only one variable, FSMO is advantageous in that each QP sub‐problem does not need subset selection. For the various test sets, FSMO is as accurate as an existing structured SVM implementation (SVM‐Struct) but is much faster on large data sets. The training time of FSMO empirically scales between O ( n ) and O(n1.2), while SVM‐Struct scales between O(n1.5) and O(n1.8).  相似文献   

7.
Architectures for designing single constant multipliers in Residue Number System (RNS) for moduli of the 2 n −1, 2 n and 2 n  + 1 forms are introduced with the constant operand being recoded in Signed-Digit representation. Two methodologies are proposed. In the first one a straightforward implementation of the shift-and-add algorithm is adopted, while in the second one a graph-based approach is used. Both methodologies result in circuits that are shown to be efficient in terms of area and delay.  相似文献   

8.
This paper reports on the implementation of carrier‐selective tunnel oxide passivated rear contact for high‐efficiency screen‐printed large area n‐type front junction crystalline Si solar cells. It is shown that the tunnel oxide grown in nitric acid at room temperature (25°C) and capped with n+ polysilicon layer provides excellent rear contact passivation with implied open‐circuit voltage iVoc of 714 mV and saturation current density J0b of 10.3 fA/cm2 for the back surface field region. The durability of this passivation scheme is also investigated for a back‐end high temperature process. In combination with an ion‐implanted Al2O3‐passivated boron emitter and screen‐printed front metal grids, this passivated rear contact enabled 21.2% efficient front junction Si solar cells on 239 cm2 commercial grade n‐type Czochralski wafers. Copyright © 2016 John Wiley & Sons, Ltd.  相似文献   

9.
Two-pattern tests target the detection of most common failure mechanisms in cmos vlsi circuits, which are modeled as stuck-open or delay faults. In this paper the Accumulator-Based Two-pattern generation (ABT) algorithm is presented, that generates an exhaustive n-bit two-pattern test within exactly 2 n × (2 n – 1) + 1 clock cycles, i.e. within the theoretically minimum time. The ABT algorithm is implemented in hardware utilizing an accumulator whose inputs are driven by either a binary counter (counter-based implementation) or a Linear Feedback Shift Register (LFSR-based implementation). With the counter-based implementation different modules, having different number of inputs, can be efficiently tested using the same generator. For circuits that do not contain counters, the LFSR-based implementation can be implemented, since registers (that typically drive the accumulator inputs into dapatapath cores) can be easily modified to LFSRS with small increase in the hardware overhead. The great advantage of the presented scheme is that it can be implemented by augmening existing datapath components, rather than building a new pattern generation structure.  相似文献   

10.
Scaling is an important operation because of the iterative nature of arithmetic processes in digital signal processors (DSPs). In residue number system (RNS)–based DSPs, scaling represents a performance bottleneck based on the complexity of inter‐modulo operations. To design an efficient RNS scaler for special moduli sets, a body of literature has been dedicated to the study of the well‐known moduli sets {2n ? 1, 2n, 2n + 1} and {2n, 2n ? 1, 2n+1 ? 1}, and their extension in vertical or horizontal forms. In this study, we propose an efficient programmable RNS scaler for the arithmetic‐friendly moduli set {2n+p, 2n ? 1, 2n+1 ? 1}. The proposed algorithm yields high speed and energy‐efficient realization of an RNS programmable scaler based on the effective exploitation of the mixed‐radix representation, parallelism, and a hardware sharing technique. Experimental results obtained for a 130 nm CMOS ASIC technology demonstrate the superiority of the proposed programmable scaler compared to the only available and highly effective hybrid programmable scaler for an identical moduli set. The proposed scaler provides 43.28% less power consumption, 33.27% faster execution, and 28.55% more area saving on average compared to the hybrid programmable scaler.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号