
1.

童元满王志英戴葵陆洪毅《小型微型计算机系统》2007,28(2):243-246

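Public-key cryptographic algorithms are widely used in information security. Modular multiplication and modular addition (subtraction) are their key operations and, for performance reasons, are usually implemented in a coprocessor. Based on the computational characteristics of public-key algorithms, this paper proposes a scalable public-key cryptographic coprocessor architecture together with a hardware/software cooperative pipelined mode of operation, and improves the implementation of modular addition (subtraction), so that public-key algorithms are supported efficiently. The coprocessor architecture can also be scaled flexibly to meet different trade-offs between hardware complexity and performance.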
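As a rough illustration of the modular addition (subtraction) operations this abstract mentions, the sketch below computes them with a single conditional correction. It is a generic textbook formulation, not the improved method of the paper, which the abstract does not describe. The function names `mod_add` and `mod_sub` are hypothetical, and the operands are assumed to be already reduced to the range [0, m).

```python
def mod_add(a: int, b: int, m: int) -> int:
    """Return (a + b) mod m, assuming 0 <= a, b < m."""
    s = a + b
    # The sum is below 2m, so one conditional subtraction is enough.
    return s - m if s >= m else s


def mod_sub(a: int, b: int, m: int) -> int:
    """Return (a - b) mod m, assuming 0 <= a, b < m."""
    d = a - b
    # The difference is above -m, so one conditional addition is enough.
    return d + m if d < 0 else d
```

Avoiding a general division in this way is what makes the operation cheap to build in hardware: only an adder, a subtractor, and a comparison are needed.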

2.

Design and implementation of SM4 extension instructions based on RISC-V

李晨琪袁国材樊荣《计算机与数字工程》2022,50(2):256-260

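To better implement the SM4 cipher on resource-constrained terminals, the paper designs and implements an SM4 instruction-set extension based on the open-source RISC-V instruction set and the VexRiscv processor. The extension comprises two instructions, one for the key-expansion part of SM4 and one for the cipher part, trading a small hardware resource overhead for higher throughput than a software implementation of SM4. The SM4 extension instructions designed and implemented in the paper, using Xilinx...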

3.

Implementation mechanism for complex cryptographic algorithms on a single Bluetooth chip

黄一才郁滨《计算机应用》2012,32(12):3453-3455

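Based on an in-depth analysis of the internal structure of the Bluetooth chip and of its operating characteristics, the authors design a parallel instruction structure model for cryptographic algorithms on a digital signal processor (DSP) coprocessor, together with the working process of the algorithms. The model weighs both storage and time overhead, and implements computation-heavy, high-complexity cryptographic algorithms on the DSP. Experimental results show that the method reduces the impact of cryptographic algorithms on Bluetooth transmission performance and solves the problem of implementing complex algorithms on a single Bluetooth chip.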

4.

Security analysis and protection of software implementations of the SM2 algorithm

王腾飞张海峰许森《计算机应用研究》2021,38(9):2811-2815

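The national commercial cryptography standard SM2 is a public-key cryptosystem based on elliptic curve cryptography, and its software implementations may leak sensitive data through side channels. To improve the security of SM2 in practical applications, an SM2 software implementation based on the Multiprecision Integer and Rational Arithmetic C Library (MIRACL) is analyzed using cache-timing attacks. A strategy for selecting the monitored addresses is proposed that avoids, as far as possible, errors caused by cache-line size, timing precision, and data prefetching, and an improved fixed-time protection scheme is proposed based on the leakage points. Experiments show that, under cache-timing attacks mounted in the same way, the fixed-time scalar multiplication function protects the sensitive data in SM2 better than the scalar multiplication function provided by the MIRACL library. This indicates that an SM2 implementation based on the MIRACL library needs the necessary protective measures before it can withstand cache-timing attacks.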

5.

RCCA-secure design of the Chinese national SM2 encryption algorithm

陈荣茂王毅黄欣沂《中国科学:信息科学》2023,(2):266-281

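The Chinese national cryptographic algorithm SM2 has become a key technology for keeping China's network information systems secure, independent, and controllable. However, recent research has found that the SM2 encryption algorithm faces efficient algorithm-substitution attacks when deployed in practice. Such an attack can predict the random number used in the next encryption from the current ciphertext, and can therefore decrypt subsequent ciphertexts without knowing the decryption key. Cryptographic reverse firewalls have been shown to resist this attack effectively, but they require ciphertexts to be rerandomizable, which conflicts with the CCA (chosen-ciphertext attack) security of the SM2 encryption algorithm itself. To address this problem, this paper modifies the SM2 encryption algorithm and constructs a public-key encryption scheme with RCCA (replayable CCA) security. The scheme offers security close to that of the SM2 encryption algorithm while also supporting ciphertext rerandomization, so it is compatible with cryptographic reverse firewalls. The design follows the three-round OAEP construction paradigm proposed by Phan et al., adapted to the SM2 encryption algorithm, and a rigorous security proof is given in the random oracle model. This paper presents the first rerandomizable RCCA-secure public-key encryption scheme based on a Chinese national cryptographic algorithm, and the results help improve the security of SM2 in practical applications.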

6.

Fast implementation of an elliptic curve cryptographic coprocessor over GF(2^m)

杨先文李峥《计算机工程与设计》2008,29(5):1086-1088

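Based on an analysis of elliptic curve cryptosystems, a hardware design for their basic arithmetic units is given, and an elliptic curve cryptographic coprocessor over GF(2^m) is implemented on an FPGA. Dual-port RAM is used to attach the coprocessor to a microcontroller, and, depending on the instruction scheduling of the microcontroller, the coprocessor can perform the five basic operations of elliptic curve cryptography. Implementation results show that the coprocessor can accommodate any finite field with 160 ≤ m ≤ 400 and meets the requirements of digital signature and data encryption/decryption applications well.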

7.

Implementing secure wireless LAN access with ARM and FPGA

杨峰李景峰刘飞祝跃飞《微计算机信息》2005,(20):111-113

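The evolution of wireless LAN security standards places higher demands on the implementation of the AP (access point). By analyzing the characteristics of several new WLAN security standards, this paper designs a hardware system based on Samsung's S3C2510 microprocessor to implement secure WLAN access. To address the bottleneck of insufficient cryptographic computing power in the AP system, a cryptographic coprocessor is added to the hardware system. The coprocessor is implemented on an FPGA chip, is readily extensible, and resolves this bottleneck well.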

8.

Java implementation and evaluation of the SM2 cryptographic algorithm

聂意新刘彬彬任伟《信息网络安全》2013,(8)

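The article introduces the SM2 elliptic curve public-key cryptographic algorithm, analyzes its implementation flow and techniques, and implements the four main functions of SM2 in Java: SM2 digital signature, SM2 signature verification, SM2 encryption, and SM2 decryption. The article also tests the performance of the algorithm and demonstrates its practicality.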

9.

Identity-based authenticated key exchange protocol based on SM2

王晓虎林超伍玮《信息安全学报》2024,9(2):84-95

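Session secret keys (SSKs) enable secure communication between remote parties and play an important role in real open-network deployments. Traditionally, SSKs are established mainly through authenticated key exchange (AKE) protocols built on public-key infrastructure, which incur expensive computation, communication, and storage overheads because of cumbersome operations such as certificate issuance, renewal, and revocation. Identity-based (ID) AKE (ID-AKE) protocols can solve this problem, but most existing ID-AKE protocols are designed on foreign cryptographic algorithms, and no ID-AKE protocol based on a Chinese commercial cryptographic algorithm has yet been formally published in a domestic or international journal, which does not meet China's requirement that core cryptographic technology be independent and controllable. The SM2 authenticated key exchange (Authenticated Key Exchange From SM2, SM2-AKE) protocol is widely used in commercial cryptography because of its high security and efficiency. However, its certificate-management overhead remains unsolved, which greatly limits the application and adoption of SM2-AKE. Within the identity-based cryptography (IBC) framework, the article uses a Schnorr-like signature key-generation method to design an SM2-based identity authenticated key exchange (SM2-ID-AKE) protocol, and proves its security under the CDH assumption in the random oracle model. Theoretical analysis and simulation results show that, compared with existing ID-AKE protocols, the proposed protocol saves at least 66.67% of the communication bandwidth and 34.05% of the computation overhead, effectively reducing system cost and burden, and is better suited to the secure-communication needs of different users in areas such as network communication deployment.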

10.

Research on applications of the Chinese commercial cryptographic algorithm SM2

王森《网络安全技术与应用》2022,(10):22-25

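As China's requirements for independent and controllable information security grow, SM2 and the related domestic cryptographic algorithms have played an important role. At present, SM2 still suffers from insufficient adoption and non-standardized management. This paper introduces the principles of the SM2 algorithm and details SM2 application practice. In identity authentication, SM2 digital certificates are used to achieve mutual authentication between user and server; in electronic seals, electronic signatures are used to provide authoritative certification of electronic documents; in trusted computing, root-of-trust signatures and the chain of trust are used to attest the state of trusted system components.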

11.

A general-purpose security coprocessor

孙季丰袁春林盛艳青刘斌《计算机工程》2008,34(22):168-170

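Exploiting two characteristics of encryption and decryption algorithms, namely frequent memory accesses and loops whose execution and bounds correspond one-to-one with the length of the data being processed, an instruction set for the fast implementation of multiple algorithms is proposed, including a five-stage pipelined hardware implementation of that instruction set. A complete prototype of a general-purpose security coprocessor is designed and implemented at both the software and hardware levels. Experiments show that the coprocessor has a sound structure and functionality.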

12.

Design and implementation of a parallel CISC instruction decoder

张骏樊晓桠张萌《计算机应用研究》2007,24(11):200-202

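To address the slow speed and low efficiency of serial decoding in x86-compatible microprocessors, a parallel decoder design is proposed. The design splits decoding into two pipelined stages, length decoding and address decoding. For instructions without prefixes, length decoding completes in a single cycle, and any two instructions can be decoded in parallel, which improves decoding efficiency. The design is described in Verilog-HDL and synthesized with SYNOPSYS-DV using the SMIC CMOS 0.18 process library. The results show that the design requirements are fully met.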

13.

Using conditional execution to exploit instruction level concurrency

Rod Adams Sue Gray 《Software》1995,25(9):1003-1020

Multiple-instruction-issue processors seek to improve performance over scalar RISC processors by providing multiple pipelined functional units in order to fetch, decode and execute several instructions per cycle. The process of identifying instructions which can be executed in parallel and distributing them between the available functional units is referred to as instruction scheduling. This paper describes a simple compile-time scheduling technique, called conditional compaction, which uses the concept of conditional execution to move instructions across basic block boundaries. It then presents the results of an investigation into the performance of the scheduling technique using C benchmark programs scheduled for machines with different functional unit configurations. This paper represents the culmination of our investigation into how much performance improvement can be obtained using conditional execution as the sole scheduling technique.

14.

A method for improving fetch-unit throughput in simultaneous multithreading VLIW processors

万江华陈书明《计算机工程与科学》2007,29(6):97-101

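In a simultaneous multithreading processor, raising the throughput of the fetch unit means fiercer cache contention among threads, and that contention in turn limits fetch-unit throughput. Targeting the new characteristics of current very long instruction word architectures, this paper proposes a method that raises the throughput of both the fetch unit and the processor. By invalidating useless addresses in the fetch pipeline as early as possible, the method reduces the program-cache conflicts caused by useless fetches and also improves overall processor performance. Experimental results show that the method raises the throughput of the processor and of the fetch unit by 12% to 23% in relative terms, while the miss rate of the level-1 program cache increases only slightly or even decreases. In addition, it reduces level-1 program cache read accesses by 10% to 25%, thereby lowering processor power consumption.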

15.

Design and implementation of a loop instruction buffer for VLIW processors

李勇胡慧俐杨焕荣《计算机应用》2014,34(4):1005-1009

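Loops account for a large share of execution time in digital signal processing software, so holding loop code in an instruction buffer reduces the number of program-memory accesses and improves processor performance. A buffer that supports loop instructions is added to the instruction pipeline of a VLIW processor; it caches the instructions of a loop and dispatches them to the functional units in software-pipelined form. Loop code is thus fetched from memory once and executed many times, which greatly reduces memory accesses. While loop instructions are running, the buffer signals the program memory to enter a sleep state, lowering processor power consumption. Tests with typical applications show that, with the loop buffer, the fetch pipeline is idle more than 90% of the time and overall processor performance improves by about 10%, while the hardware area overhead of the loop buffer is about 9% of the fetch pipeline.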
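The "fetch once, execute many times" effect described in this abstract can be made concrete with a small counting model. This is only a sketch under stated assumptions: the function name `program_memory_fetches` is hypothetical, the whole loop body is assumed to fit in the buffer, and each instruction is assumed to cost one program-memory access. It does not model the paper's actual hardware.

```python
def program_memory_fetches(body_len: int, iterations: int,
                           use_loop_buffer: bool) -> int:
    """Count program-memory instruction fetches for one loop.

    body_len:   number of instructions in the loop body (assumed to fit
                entirely in the loop buffer)
    iterations: number of times the loop body executes (at least 1)
    """
    if body_len < 0 or iterations < 1:
        raise ValueError("body_len must be >= 0 and iterations >= 1")
    if use_loop_buffer:
        # The first iteration fills the buffer; later iterations are
        # replayed from the buffer without touching program memory.
        return body_len
    # Without a buffer, every iteration re-fetches the whole body.
    return body_len * iterations
```

For example, an 8-instruction loop body run 100 times needs 800 fetches without the buffer and 8 with it under these assumptions, which is why the program memory can sleep for most of the loop.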

16.

Variable Length Instruction Compression on Transport Triggered Architectures

Timo Viitanen Janne Helkala Heikki Kultala Pekka Jääskeläinen Jarmo Takala Tommi Zetterman Heikki Berg 《International journal of parallel programming》2018,46(6):1283-1303

The memories used for embedded microprocessor devices consume a large portion of the system’s power. The power dissipation of the instruction memory can be reduced by using code compression methods, which may require the use of variable length instruction formats in the processor. The power-efficient design of variable length instruction fetch and decode is challenging for static multiple-issue processors, which aim for low power consumption on embedded platforms. The memory-side power savings using compression are easily lost on inefficient fetch unit design. We propose an implementation for instruction template-based compression and two instruction fetch alternatives for variable length instruction encoding on transport triggered architecture, a static multiple-issue exposed data path architecture. With applications from the CHStone benchmark suite, the compression approach reaches an average compression ratio of 44% at best. We show that the variable length fetch designs reduce the number of memory accesses and often allow the use of a smaller memory component. The proposed compression scheme reduced the energy consumption of synthesized benchmark processors by 15% and area by 33% on average. 相似文献

17.

A method for implementing zero-overhead loops in a microcontroller

薛超凡张盛兵《计算机工程》2012,38(9):244-247

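To broaden the chip's application areas and strengthen its DSP capability, a design method is proposed that supports zero-overhead loops in an MCU processor. Because loops occur frequently in DSP programs, dedicated hardware is designed to handle them and to eliminate the pipeline stalls caused by loop branches. Based on an analysis of the original MCU structure, in particular the instruction unit, loop instructions are handled differently from other branch instructions. Zero-overhead loops are supported while changing the original MCU structure as little as possible. Performance analysis shows that the improved MCU effectively reduces the number of cycles spent executing loops.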

18.

Pipeline of the Yinhe TS-1 microprocessor (total citations: 1; self-citations: 0; citations by others: 1)

赵学秘陆洪毅王蕾戴葵王志英《计算机工程》2003,29(5):142-143,F003

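The Yinhe TS-1 microprocessor is a 32-bit embedded microprocessor with independent intellectual property rights, designed in-house by the School of Computer Science of the National University of Defense Technology. The basic instruction-processing path and datapath of the Yinhe TS-1 pipeline core were designed with reference to the standard five-stage DLX pipeline, and on this basis a more efficient six-stage pipeline is proposed: instruction fetch, decode, operand preparation, ALU execution, data access, and write-back. Compared with the five-stage pipeline, the six-stage pipeline adds very little hardware overhead, but its speedup is less than 1.54.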

19.

A section cache system designed for VLIW architectures

《Journal of Systems Architecture》2000,46(14):1293-1308

The static specification of operations executed in parallel using No Operations (NOPs) is another cause of increased code size in VLIW architectures. Some alternatives in the instruction encoding and memory subsystem have been proposed to minimize the impact of NOPs on code size. One is the compressed cache using the packed encoding scheme and the other is the decompressed cache using the unpacked encoding scheme. The compressed cache shows high memory utilization but increases the pipeline branch penalty because it requires very complex fetch hardware. By contrast, the fetch overhead can be decreased in the decompressed cache because the unpacked encoding scheme allows an instruction to be issued to the pipeline without any recovery process. However, its shortcoming is that memory utilization deteriorates because memory is allocated irrespective of the number of useful operations. In this research, a new instruction encoding scheme called a semi-packed encoding scheme and the section cache, which enables effective storage and retrieval of semi-packed instructions, are proposed. This can decrease the hardware complexity of fetching an instruction and the memory space wasted on NOPs via the partially fixed length of an instruction. The experimental results reveal that the memory utilization in the section cache is 3.4 times higher than in the decompressed cache. The memory subsystem using the section cache can provide about 15% performance improvement with a moderate chip area.
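To make the packed versus unpacked trade-off in this abstract concrete, the sketch below encodes a VLIW bundle both ways. It is an illustration under stated assumptions, not the paper's scheme: the packed format is assumed here to be a slot bitmask plus the useful operations, one common way to drop NOPs, and all names (`NOP`, `unpacked_size`, `packed_encode`, `packed_decode`) are hypothetical. The semi-packed encoding itself is not specified in the abstract and is not modeled.

```python
NOP = None  # placeholder for an empty issue slot


def unpacked_size(bundles, slots):
    """Unpacked encoding: every bundle stores all slots, NOPs included."""
    return len(bundles) * slots


def packed_encode(bundle):
    """Packed encoding: a slot bitmask plus only the useful operations."""
    mask = 0
    ops = []
    for i, op in enumerate(bundle):
        if op is not NOP:
            mask |= 1 << i      # bit i set means slot i holds an operation
            ops.append(op)
    return mask, ops


def packed_decode(mask, ops, slots):
    """Recovery step that the fetch hardware must perform for packed code."""
    remaining = iter(ops)
    return [next(remaining) if (mask >> i) & 1 else NOP
            for i in range(slots)]
```

The unpacked form can be issued directly but stores every NOP; the packed form stores only useful operations but needs the `packed_decode` step before issue, which corresponds to the extra fetch complexity the abstract attributes to the compressed cache.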

20.

A code compression system based on pipelined interpreters

Jan Hoogerbrugge Lex Augusteijn Jeroen Trum Rik van de Wiel 《Software》1999,29(11):1005-1023

This paper describes a system for compressed code generation. The code of applications is partitioned into time-critical and non-time-critical code. Critical code is compiled to native code, and non-critical code is compiled to a very dense virtual instruction set which is executed on a highly optimized interpreter. The system employs dictionary-based compression by means of superinstructions which correspond to patterns of frequently used base instructions. The code compression system is designed for the Philips TriMedia VLIW processor. The interpreter is pipelined to achieve a high interpretation speed. The pipeline consists of three stages: fetch, decode, and execute. While one instruction is being executed, the next instruction is decoded, and the next one after that is fetched from memory. On a TriMedia VLIW with a load latency of three cycles and a jump latency of four cycles, the interpreter achieves a peak performance of four cycles per instruction and a sustained performance of 6.27 cycles per instruction. Experiments are described that demonstrate the compression quality of the system and the execution speed of the pipelined interpreter: the compressed code was found to be about five times more compact than native TriMedia code, at a slowdown of about eight times. Copyright © 1999 John Wiley & Sons, Ltd.
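The idea of superinstructions as dictionary entries for frequent instruction patterns can be sketched in a few lines. The example below is a simplification under stated assumptions: it considers only adjacent pairs of base instructions and fuses the single most frequent pair, whereas the abstract does not say how patterns are selected. The function names and the instruction mnemonics are hypothetical.

```python
from collections import Counter


def most_frequent_pair(code):
    """Return the most frequent adjacent pair of base instructions."""
    pairs = Counter(zip(code, code[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def fuse(code, pair, name):
    """Replace each non-overlapping occurrence of `pair` with `name`."""
    out = []
    i = 0
    while i < len(code):
        if i + 1 < len(code) and (code[i], code[i + 1]) == pair:
            out.append(name)    # one superinstruction replaces two base ones
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out


code = ["load", "add", "store", "load", "add", "jump"]
pair = most_frequent_pair(code)
compressed = fuse(code, pair, "load_add")
```

Here the pair `("load", "add")` occurs twice, so fusing it shortens the six-instruction sequence to four virtual instructions; an interpreter would then need one extra handler for `load_add`.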