Similar literature
1.
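This paper proposes a high-performance elliptic curve cryptography processor for the widely used TLS 1.3 protocol. The processor supports general moduli for the two classes of prime-field elliptic curves defined in TLS 1.3. By improving the high-radix Montgomery algorithm, the authors propose a modular multiplication unit that supports operand widths of 521 bits and below, together with a scalar multiplier built from two modular multiplication units working in parallel. Based on this structure, parallel point-operation schedules in Jacobian coordinates are designed for the two classes of curves, so that the utilization of the modular multiplication units reaches 100%, 95.4%, and 86.5% for the different point operations. Compared with existing designs, scalar multiplication takes fewer cycles and the arithmetic units are better utilized; at a similar area-time product, the design is more general and more configurable. In TSMC 55 nm CMOS technology it reaches a clock frequency of 454 MHz with an equivalent gate count of 851k, and performs 31,230 scalar multiplications per second on the Secp256r1 curve.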
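The abstract names a high-radix Montgomery algorithm but gives no details of it. As a minimal sketch only, the following shows the textbook word-serial (radix 2^w) Montgomery multiplication; the function name `mont_mul`, the word size `w = 32`, and the word count are illustrative assumptions and do not describe the paper's improved unit.

```python
def mont_mul(a: int, b: int, n: int, w: int = 32, words: int = 8) -> int:
    """Return a * b * R^-1 mod n, where R = 2^(w * words).

    Assumes n is odd, n < R, and a, b < n (hypothetical interface).
    """
    mask = (1 << w) - 1
    # n_prime = -n^-1 mod 2^w, computed once per modulus (Python 3.8+ pow)
    n_prime = (-pow(n, -1, 1 << w)) & mask
    t = 0
    for i in range(words):
        a_i = (a >> (w * i)) & mask         # next radix-2^w digit of a
        t += a_i * b                        # accumulate the partial product
        q = ((t & mask) * n_prime) & mask   # digit that zeroes the low word
        t = (t + q * n) >> w                # exact division by 2^w
    return t - n if t >= n else t           # single conditional subtraction
```

Operands are first mapped to Montgomery form (`x * R % n`); multiplying a Montgomery-form product by 1 with the same routine maps it back to an ordinary residue.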

2.
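The shifter is an important arithmetic unit in a digital signal processor. This paper presents the design and implementation of a barrel shifter that uses a new inter-stage interconnection scheme and further optimizes the circuit structure with an improved 2-to-1 selector. Experimental and simulation results show that the proposed scheme considerably reduces the power and area of the shifter, thereby improving processor performance. A digital signal processor using this shifter has been successfully taped out in SMIC 0.18 μm CMOS technology.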
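For readers unfamiliar with the structure, a barrel shifter is commonly built from log2(width) stages of 2-to-1 selectors, each stage shifting by a fixed power of two. The sketch below is a behavioural model of that generic structure; the function name and the 32-bit default width are assumptions, and the paper's interconnection scheme and improved selector are not modelled.

```python
def barrel_shift_left(value: int, shift: int, width: int = 32) -> int:
    """Logical left shift built from log2(width) stages of 2:1 selectors."""
    mask = (1 << width) - 1
    stages = (width - 1).bit_length()          # 5 stages for a 32-bit word
    value &= mask
    shift %= width
    for k in range(stages):
        shifted = (value << (1 << k)) & mask   # this stage's fixed shift by 2^k
        select = (shift >> k) & 1              # control bit k drives the stage
        value = shifted if select else value   # one 2:1 selector per bit position
    return value
```

Each bit of the shift amount controls exactly one stage, which is why the delay grows with log2(width) rather than with the width itself.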

3.
张子骥  贺雅娟  张波 《微电子学》2021,51(4):552-556
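A high-energy-efficiency approximate DCT circuit based on multiple supply voltages is designed. By approximating the arithmetic units and coefficients of the DCT, the circuit is simplified and the energy per DCT operation is reduced. To further improve the energy efficiency of the approximate DCT circuit, some units are powered at a lower voltage, which achieves lower power consumption without degrading DCT performance. A comparison of DCT circuits in 0.18 μm CMOS technology shows that, relative to the approximate design at standard voltage and the full...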

4.
Implementation of an RSA cryptographic coprocessor   Cited by: 11 (self-citations: 0, citations by others: 11)
李树国  周润德  冯建华  孙义和 《电子学报》2001,29(11):1441-1444
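The excessive area and low speed of cryptographic coprocessors have limited the use of the RSA public-key cryptosystem in smart cards. This paper analyzes and improves the Montgomery modular multiplication algorithm and proposes a new high-radix modular multiplier structure suited to smart cards. Because the coprocessor uses a parallel pipelined structure with two 32-bit multipliers, it reduces both the chip area and the number of clock cycles per modular multiplication compared with a systolic array structure, which makes RSA digital signature and authentication feasible in smart cards. Experiments show that, with a 0.35 μm TSMC standard-cell library, the coprocessor needs 1216 clock cycles for one 1024-bit modular multiplication and occupies 38k gates. At a 5 MHz clock, encrypting 1024 bits of plaintext takes only 374 ms on average. Compared with similar designs, this design has the fewest clock cycles per modular multiplication and reduces the chip area by one third. These figures are better than those of the cryptographic coprocessors currently used in e-commerce, and the design is suitable for smart cards.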
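The 374 ms figure can be checked against the other numbers in the abstract. The short calculation below assumes, and this is an assumption not stated in the abstract, a binary square-and-multiply exponentiation with a full-length 1024-bit exponent, which costs one squaring per bit plus one multiplication for about half of the bits.

```python
CYCLES_PER_MODMUL = 1216   # from the abstract: one 1024-bit modular multiplication
CLOCK_HZ = 5_000_000       # from the abstract: 5 MHz clock
EXPONENT_BITS = 1024       # assumption: full-length 1024-bit exponent

# Assumption: one squaring per exponent bit plus, on average, one
# multiplication for every second bit (binary square-and-multiply).
modmuls = EXPONENT_BITS + EXPONENT_BITS // 2
total_cycles = modmuls * CYCLES_PER_MODMUL
time_ms = total_cycles / CLOCK_HZ * 1000
print(f"{modmuls} modular multiplications, {total_cycles} cycles, {time_ms:.1f} ms")
```

Under these assumptions the estimate comes to about 373.6 ms, which is consistent with the reported average of 374 ms.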

5.
A high-speed CMOS fully differential operational amplifier   Cited by: 8 (self-citations: 2, citations by others: 6)
朱小珍  朱樟明  柴常春 《半导体技术》2006,31(4):287-289,299
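A high-speed CMOS fully differential operational amplifier is designed and discussed. The design uses a folded-cascode structure, continuous-time common-mode feedback, and a distinctive bias circuit to achieve high speed and good stability. Simulation results based on TSMC 0.25 μm CMOS technology show that, with a single 2.5 V supply, the amplifier has a DC open-loop gain of 71.9 dB, a unity-gain bandwidth of 495 MHz (CL = 0.5 pF), a settling time of 24 ns, and a power consumption of 3.9 mW.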

6.
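This paper analyzes how the speed and area of a VLSI implementation of the FIPS-based multiply-accumulate structure change with operand width, and proposes an improved FIPS algorithm that resolves the data stall caused by a pipelined datapath. Based on the improved algorithm, a modular multiplier with 128-bit operands is designed in SMIC 0.18 μm CMOS technology; compared with a design based on the original algorithm, the hardware area increases by about 5% and the efficiency improves by about 42%. When this modular multiplier is used for 1024-bit RSA, the throughput reaches 1.1 Mbps.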

7.
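This paper describes the design of a multiplier module for a fingerprint-recognition application-specific integrated circuit (ASIC). The module handles multiplication and multiply-accumulate operations on 32-bit signed and unsigned numbers. The circuit uses radix-4 Booth encoding and an improved compressor array. The proposed structure, which combines iteration with an array, saves 30% of the chip area and raises the operating frequency by 24%. The module is implemented in TSMC 0.25 μm technology and is easy to port to other digital processing systems.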
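Radix-4 Booth encoding halves the number of partial products by recoding the multiplier into digits from {-2, -1, 0, 1, 2}. The following is a minimal behavioural sketch of the standard recoding for a signed multiplier; the function names are hypothetical, and the compressor array and the unsigned mode mentioned in the abstract are not modelled.

```python
def booth_radix4_digits(y: int, bits: int = 32) -> list[int]:
    """Recode a signed `bits`-wide multiplier into digits in {-2, -1, 0, 1, 2}."""
    y &= (1 << bits) - 1                      # two's-complement bit pattern
    digits = []
    prev = 0                                  # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)    # digit with weight 4^(i/2)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, bits: int = 32) -> int:
    """Sum the bits/2 partial products selected by the Booth digits."""
    digits = booth_radix4_digits(y, bits)
    return sum((d * x) << (2 * k) for k, d in enumerate(digits))
```

A 32-bit multiplier yields 16 digits, so only 16 partial products (each 0, ±x, or ±2x) need to be summed instead of 32.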

8.
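This paper proposes a high-performance, low-power fixed-point special function unit for mobile vertex processors. The unit supports the fixed-point data format of the embedded graphics standard OpenGL ES 1.X and computes elementary functions, including reciprocal, square root, reciprocal square root, logarithm, and exponential, with 16 fractional bits of precision. The elementary functions are approximated by piecewise quadratic polynomial interpolation; introducing a 2^(-1/2) arithmetic circuit into the coefficient processing reduces the total size of the quadratic-polynomial lookup tables by 29% compared with a conventional design at the same precision. The computation and truncation errors of the quadratic interpolation algorithm are optimized so that the lookup table size and the area and speed of the squarer, multiplier, and adders are optimal. The circuit is implemented in 0.18 μm CMOS technology with an area of 0.112 mm², a clock frequency of 300 MHz, and a power consumption of only 12.8 mW. Test results show that the unit is well suited to elementary function computation in mobile graphics vertex processors.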
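Piecewise quadratic interpolation can be illustrated with the reciprocal function. The sketch below is an assumption-laden toy: it uses floating point rather than the paper's fixed-point format, 64 segments chosen arbitrarily, and plain Taylor coefficients rather than the optimized coefficients and 2^(-1/2) circuit described in the abstract.

```python
SEGMENTS = 64   # assumption: table size chosen for illustration only

# Per-segment coefficients of the second-order Taylor expansion of 1/x
# about the segment midpoint m: 1/x ~ 1/m - d/m^2 + d^2/m^3, with d = x - m.
TABLE = []
for s in range(SEGMENTS):
    m = 1.0 + (s + 0.5) / SEGMENTS
    TABLE.append((m, 1.0 / m, -1.0 / m**2, 1.0 / m**3))


def reciprocal(x: float) -> float:
    """Approximate 1/x for 1 <= x < 2 with one table lookup and a quadratic."""
    s = int((x - 1.0) * SEGMENTS)     # leading fraction bits pick the segment
    m, c0, c1, c2 = TABLE[s]
    d = x - m
    return c0 + d * (c1 + d * c2)     # Horner form: two multiplies, two adds
```

With these choices the truncation error is bounded by (1/128)^3, roughly 4.8e-7, which is below 2^-16; a hardware design would trade segment count against coefficient width to reach the same target with a smaller table.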

9.
王自强  王志华 《半导体技术》2004,29(11):65-67,60
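An intermediate-frequency variable-gain amplifier for wireless receivers is designed. The amplifier consists of an operational amplifier and a resistive feedback network. The causes of distortion in closed-loop variable-gain amplifiers are analyzed, and the distortion of large output signals is reduced by measures such as improving the linearity of the output resistance. The fully differential variable-gain amplifier uses a 0.25 μm CMOS process from Dongbu ("东部") of Korea with a 3.3 V supply; at a 2 Vpp differential output the distortion is below -80 dB, and the amplifier consumes 3 mW.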

10.
Data encryption engine   Cited by: 1 (self-citations: 0, citations by others: 1)
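With the development of communication technology, data encryption is becoming increasingly important. Because encryption requires a large amount of computation, a software implementation is too slow, so we designed a DES encryption chip using ASIC technology. DES has many different modes, and the engine implements the most important of them. The design is implemented in 0.5 μm CMOS technology; one encryption takes 16 clock cycles, and at a 100 MHz clock the throughput reaches up to 400 Mb/s.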
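The quoted peak throughput follows directly from the cycle count and the clock rate. The calculation below uses the 64-bit DES block size, which is a property of DES rather than a figure given in the abstract.

```python
BLOCK_BITS = 64          # DES block size (property of DES, not from the abstract)
CYCLES_PER_BLOCK = 16    # from the abstract: one encryption takes 16 cycles
CLOCK_HZ = 100_000_000   # from the abstract: 100 MHz clock

throughput_bps = BLOCK_BITS * CLOCK_HZ // CYCLES_PER_BLOCK
print(throughput_bps // 1_000_000, "Mb/s")   # prints: 400 Mb/s
```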

11.
An error-tolerant, hardware-efficient very large scale integration (VLSI) architecture for bit-parallel systolic multiplication over the dual base, which can be pipelined, is presented. Since this architecture has the features of regularity, modularity, and unidirectional data flow, it is well suited to VLSI implementation. The length of the longest delay path and the area of this architecture are smaller than those of the bit-parallel systolic multiplication architectures reported earlier. The architecture is implemented using Austria Micro System's 0.35 μm CMOS (complementary metal oxide semiconductor) technology. This architecture can also operate over both the dual base and the polynomial base.

12.
An error-tolerant, hardware-efficient very large scale integration (VLSI) architecture for bit-parallel systolic multiplication over the dual base, which can be pipelined, is presented. Since this architecture has the features of regularity, modularity, and unidirectional data flow, it is well suited to VLSI implementation. The length of the longest delay path and the area of this architecture are smaller than those of the bit-parallel systolic multiplication architectures reported earlier. The architecture is implemented using Austria Micro System's 0.35 μm CMOS (complementary metal oxide semiconductor) technology. This architecture can also operate over both the dual base and the polynomial base.

13.
The authors report a VLSI design of an advanced systolic array graphics (SAG) engine built from pipelined functional units which can generate realistic images interactively for high-resolution displays. They introduce a structured frame store system as an environment for the advanced SAG engine and present the principles and architecture of the advanced SAG engine. They introduce pipelined functional units into this SAG engine to meet the performance requirements. This is done by a formal approach where the original systolic array is represented at bit level by a finite, vertex-weighted, edge-weighted, directed graph. Two architectures built from pipelined functional units are described. A prototype containing nine processing elements was fabricated in a 1.6-μm CMOS technology.

14.
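This paper proposes an embedded digital signal processing (DSP) intellectual property (IP) hard core for field-programmable gate arrays (FPGAs) that supports efficient variable-width addition. Compared with the DSP structure of Altera's Stratix-III, the proposed optimized structure implements addition, multiply-add, accumulation, and other operations more efficiently. Inputs of different data types and widths are preprocessed in software, which reduces hardware overhead and further improves circuit performance. A multiplier bypass and an adder with two-stage sign extension are also added to the DSP structure, which reduces the DSP area while supporting very wide, high-speed pipelined addition and broadening the range of DSP applications. The proposed DSP IP core is designed and implemented in TSMC 55 nm standard CMOS technology and provides nine operating modes, including 72-bit variable-width addition and 36-bit variable-width multiplication.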

15.
In this paper, an efficient digit-serial systolic array is proposed for multiplication in the finite field GF(2^m) using the standard basis representation. From the least-significant-bit-first multiplication algorithm, we obtain a new dependence graph and design an efficient digit-serial systolic multiplier. If input data come in continuously, the proposed array can produce multiplication results at a rate of one every ⌈m/L⌉ clock cycles, where L is the selected digit size. Analysis shows that the computational delay time of the proposed architecture is significantly less than that of the previously proposed digit-serial systolic multiplier. Furthermore, since the new architecture has the features of regularity, modularity, and unidirectional data flow, it is well suited to VLSI implementation.
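The least-significant-bit-first algorithm that the abstract starts from can be sketched in software. The version below is the generic bit-serial form (digit size L = 1) in the standard basis; the function name and argument layout are assumptions, and the paper's dependence graph and digit-serial systolic mapping are not reproduced.

```python
def gf2m_mul_lsb_first(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m); `poly` is the field polynomial including x^m."""
    result = 0
    for i in range(m):
        if (b >> i) & 1:     # consume b starting from its least significant bit
            result ^= a      # add the current (already reduced) a * x^i
        a <<= 1              # a <- a * x
        if (a >> m) & 1:
            a ^= poly        # reduce modulo the field polynomial
    return result
```

For example, with the AES field polynomial `0x11B` and `m = 8`, `gf2m_mul_lsb_first(0x57, 0x83, 0x11B, 8)` evaluates to `0xC1`. A digit-serial design processes L of these loop iterations per clock cycle, which gives the ⌈m/L⌉ cycles per result quoted above.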

16.
A new parallel architecture is presented that is more flexible than the systolic array: the Instruction Systolic Array (ISA). In the ISA the instructions (instead of data, as in a systolic array) are pumped through an array of processors. While systolic arrays are special purpose architectures, the ISA is more universal: it is capable of executing different programs. The Instruction Systolic Array is well suited for implementation in VLSI technology.

17.
Research and design of a high-performance scalable public-key cryptographic coprocessor   Cited by: 1 (self-citations: 0, citations by others: 1)
黎明  吴丹  戴葵  邹雪城 《电子学报》2011,39(3):665-670
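This paper proposes an efficient point-multiplication scheduling strategy and an improved dual-field high-radix Montgomery modular multiplication algorithm. On this basis, a new high-performance scalable public-key cryptographic coprocessor architecture is designed and implemented in a 0.18 μm 1P6M standard CMOS process to accelerate public-key algorithms such as RSA and ECC. By extending the on-chip high-speed memory and using the radix as the processing word length, the coprocessor is scalable and flexible: it supports modular exponentiation with arbitrary operands of up to 2048 bits and scalar multiplication on arbitrary dual-field elliptic curves of up to 576 bits. Chip measurements show good acceleration: a 1024-bit modular exponentiation takes only 197 μs, a 192-bit scalar multiplication over GF(p) only 225 μs, and a 163-bit scalar multiplication over GF(2^m) only 200.7 μs.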

18.
This paper describes the work carried out in the RACE Project R2039 ATMOS (asynchronous transfer mode optical switching). The project is briefly illustrated, together with its main goal: to develop and assess concepts and technology suitable for optical fast packet switching. The project's technical approach consisted in the exploitation of the space and wavelength domains for fast routing and buffering. The major achievements are then reported. Four different switch architecture concepts have been proposed, investigated, and developed, all based on a high-speed optical routing matrix electrically controlled at lower speed. The basic optical key components and subsystems (wavelength converters, space switches, and optical buffers) are described in detail, with the outstanding results obtained and the corresponding projected performance. In particular, system demonstration of wavelength conversion at 10 and 20 Gb/s has been realized, to show the usefulness of the ATMOS technology both to implement optimized high-performance optical packet-switching fabrics and transparent optical circuit-routing nodes. Four rack-mounted, reduced-size demonstrators of basic switching matrices, scalable to real system sizes, have been designed and implemented. The good results obtained in terms of bit error rate and hardware integration are reported, showing that ATM switches are feasible with state-of-the-art optical technology.

19.
胡科 《电讯技术》2006,46(3):149-152
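This paper introduces the technical architecture of an automatic vehicle location system. It discusses the basic methods of combined FSK/FM modulation and noncoherent demodulation used in the core communication technology, analyzes several factors that affect the system bit error rate, and compares the characteristics of combined FSK/FM modulation and demodulation under different conditions.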

20.
Affine transformation is widely used in image processing. Recently, it has been recommended by MPEG-4 for video motion compensation. This paper presents a novel low-power parallel architecture for texture warping using affine transformation (AT). The architecture uses a novel multiplication-free algorithm that employs the algebraic properties of the AT. Low power has been achieved at different levels of the design. At the algorithmic level, replacing multiplication operations with bit shifting saves the power and delay of using a multiplier. At the architecture level, low power is achieved by using parallel computational units, where the latency constraints and/or the operating latency can be reduced. At the circuit level, using low-power building blocks (such as low-power adders) contributes to the power savings. The proposed architecture is used as a computational kernel in video object coders. It is compatible with the MPEG-4 and VRML standards. The architecture has been prototyped in 0.6 μm CMOS technology with three layers of metal. The performance of the proposed architecture shows that it can be used in mobile and handheld applications.
