Found 20 similar documents. Search time: 203 ms.
1.
Public-key cryptographic algorithms are widely used in information security. Modular multiplication and modular addition/subtraction are their key operations and, for performance reasons, are often implemented in a coprocessor. Based on the computational characteristics of public-key algorithms, this paper proposes a scalable public-key cryptographic coprocessor architecture with a hardware/software co-pipelined mode of operation, and improves the implementation of modular addition/subtraction, effectively supporting public-key algorithms. The architecture can also be scaled flexibly to meet different trade-offs between hardware complexity and performance.
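The abstract does not detail the improved modular add/subtract; as a rough illustration of why these operations map cheaply onto a coprocessor datapath, a single conditional correction suffices when both operands are already reduced (a sketch under that assumption, not the paper's design):

```python
# Sketch: modular add/sub as a coprocessor datapath would compute them,
# using one conditional correction instead of a general division/reduction.
# Assumes 0 <= a, b < m; operation names are illustrative.

def mod_add(a: int, b: int, m: int) -> int:
    """a + b mod m: one add, one conditional subtract."""
    s = a + b
    return s - m if s >= m else s

def mod_sub(a: int, b: int, m: int) -> int:
    """a - b mod m: one subtract, one conditional add."""
    d = a - b
    return d + m if d < 0 else d
```

In hardware, the comparison and the correction are a second adder stage rather than a branch, which is why these operations are much cheaper than modular multiplication.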
2.
To better implement the SM4 block cipher on resource-constrained terminals, this paper designs an SM4 instruction-set extension based on the open-source RISC-V ISA and the VexRiscv processor. The extension comprises two instructions, covering SM4 key expansion and the cipher rounds respectively, and trades a small hardware cost for higher throughput than a software-only SM4 implementation. The extension instructions designed in this paper were implemented and evaluated using Xilinx...
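As a minimal sketch of what the two extension instructions must each compute, the standard SM4 linear layers for the cipher rounds and for the key expansion are shown below; this is the published SM4 definition, not the paper's hardware (the S-box substitution that precedes these layers is omitted for brevity):

```python
# Sketch: the two SM4 linear transformations, one per extension instruction
# in the paper's design (round function vs. key expansion). Standard SM4
# definitions; the preceding S-box layer is omitted.

def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sm4_l_enc(b: int) -> int:
    """Linear layer L used in SM4 encryption/decryption rounds."""
    return b ^ rotl32(b, 2) ^ rotl32(b, 10) ^ rotl32(b, 18) ^ rotl32(b, 24)

def sm4_l_key(b: int) -> int:
    """Linear layer L' used in SM4 key expansion."""
    return b ^ rotl32(b, 13) ^ rotl32(b, 23)
```

A custom instruction folds the S-box lookups and one of these XOR/rotate trees into a single cycle, which is where the throughput gain over a pure software round comes from.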
3.
Based on an in-depth analysis of the internal structure of Bluetooth chips and their operating characteristics, this paper designs a parallel instruction-structure model for cryptographic algorithms on a digital signal processor (DSP) coprocessor, together with the algorithms' workflow. The model balances storage footprint against execution time, offloading computation-intensive, high-complexity cryptographic algorithms to the DSP. Experimental results show that the approach reduces the impact of cryptography on Bluetooth transmission performance and makes it feasible to run complex algorithms on a single Bluetooth chip.
4.
The Chinese commercial cryptography standard SM2 is a public-key cryptosystem based on elliptic-curve cryptography; its software implementations risk leaking sensitive data through side channels. To improve the practical security of SM2, this paper analyzes an SM2 implementation built on the Multiprecision Integer and Rational Arithmetic C Library (MIRACL) using a cache-timing attack. A strategy for selecting monitored addresses is proposed that minimizes errors caused by cache-block size, timer resolution, and data prefetching, and an improved constant-time countermeasure is derived from the observed leakage points. Experiments show that, under the same cache-timing attack, the constant-time scalar-multiplication routine protects SM2's sensitive data better than the scalar multiplication provided by MIRACL. This demonstrates that SM2 implementations based on MIRACL require explicit countermeasures to resist cache-timing attacks.
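The fixed-time scalar multiplication itself is not shown in the abstract; the sketch below illustrates the general idea with a Montgomery ladder, using modular exponentiation as a stand-in for elliptic-curve scalar multiplication (an assumption for brevity; real constant-time code would also replace the Python `if` with a constant-time conditional swap):

```python
# Sketch: the fixed-sequence idea behind a constant-time countermeasure.
# Each bit costs exactly one multiply and one square regardless of its value,
# so the operation trace does not depend on the secret scalar. Modular
# exponentiation stands in for EC scalar multiplication here.

def ladder_pow(g: int, k: int, m: int, bits: int) -> int:
    """Compute g^k mod m, processing a fixed number of bits of k."""
    r0, r1 = 1, g % m            # invariant: r1 == r0 * g (mod m)
    for i in reversed(range(bits)):
        if (k >> i) & 1:
            r0, r1 = (r0 * r1) % m, (r1 * r1) % m
        else:
            r1, r0 = (r0 * r1) % m, (r0 * r0) % m
    return r0
```

Because both branches perform the same operation mix, a cache-timing observer sees the same access pattern for every key bit, which is the property the paper's fixed-duration defense aims for.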
5.
The Chinese commercial SM2 algorithms have become a key technology for keeping China's networked information systems secure and autonomously controllable. Recent work, however, shows that deployed SM2 encryption faces an efficient algorithm-substitution attack: from the current ciphertext the attacker can predict the randomness used in the next encryption, and thus decrypt subsequent ciphertexts without the decryption key. Cryptographic reverse firewalls are known to defeat this attack, but they require rerandomizable ciphertexts, which conflicts with the CCA (chosen-ciphertext attack) security of SM2 encryption itself. To resolve this conflict, this paper modifies SM2 encryption into a public-key scheme with RCCA (replayable CCA) security. The scheme retains security comparable to SM2 while supporting ciphertext rerandomization, and is therefore compatible with reverse firewalls. The design follows the three-round OAEP paradigm of Phan et al., adapted to SM2 encryption, with a rigorous security proof in the random-oracle model. This is the first rerandomizable RCCA public-key encryption scheme based on Chinese commercial cryptography, and the results help improve the practical security of SM2.
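Ciphertext rerandomization, the property the paper grafts onto SM2, can be sketched on textbook multiplicative ElGamal, to which SM2 encryption is structurally related; the toy group parameters and function names below are illustrative only (SM2 itself works over an elliptic curve):

```python
# Sketch: rerandomizing an ElGamal ciphertext by multiplying in a fresh
# encryption of 1. The rerandomized ciphertext decrypts identically but is
# unlinkable to the original. Toy subgroup of Z_p*; not real parameters.

P = 1019   # safe prime: 1019 = 2*509 + 1
Q = 509    # prime subgroup order
G = 4      # generator of the order-Q subgroup (quadratic residues)

def encrypt(pk: int, msg: int, r: int) -> tuple[int, int]:
    return pow(G, r, P), (msg * pow(pk, r, P)) % P

def rerandomize(pk: int, ct: tuple[int, int], s: int) -> tuple[int, int]:
    """Multiply in an encryption of 1 under fresh randomness s."""
    c1, c2 = ct
    return (c1 * pow(G, s, P)) % P, (c2 * pow(pk, s, P)) % P

def decrypt(sk: int, ct: tuple[int, int]) -> int:
    c1, c2 = ct
    # c1^(-sk) computed as c1^(Q-sk), since c1 lies in the order-Q subgroup
    return (c2 * pow(c1, Q - sk, P)) % P
```

A reverse firewall sitting between the sender and the network applies `rerandomize` to every outgoing ciphertext, destroying any covert channel hidden in the sender's randomness; the paper's contribution is making an SM2-style scheme tolerate exactly this operation while keeping RCCA security.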
6.
Building on an analysis of elliptic-curve cryptosystems, this paper presents a hardware design for their basic arithmetic units and implements an elliptic-curve cryptographic coprocessor over GF(2^m) on an FPGA. The coprocessor is attached to a microcontroller through dual-port RAM and, driven by the microcontroller's instruction dispatch, performs the five basic operations of elliptic-curve cryptography. Implementation results show that the coprocessor supports arbitrary finite fields in the range 160 ≤ m ≤ 400 and meets the requirements of digital-signature and data encryption/decryption applications.
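The basic arithmetic unit such a coprocessor iterates for point operations is GF(2^m) multiplication; a minimal software model is shown below, using GF(2^8) with the AES polynomial for brevity rather than the paper's 160-400-bit fields:

```python
# Sketch: GF(2^m) multiplication as shift-and-XOR with polynomial reduction,
# the core primitive of a binary-field EC coprocessor. GF(2^8) with the AES
# polynomial is used here only to keep the example small.

M = 8
POLY = 0x11B   # x^8 + x^4 + x^3 + x + 1, irreducible over GF(2)

def gf2m_mul(a: int, b: int) -> int:
    """Carry-less multiply of a and b, reduced modulo POLY."""
    acc = 0
    for i in range(M):                   # schoolbook multiply, XOR as addition
        if (b >> i) & 1:
            acc ^= a << i
    for i in range(2 * M - 2, M - 1, -1):  # reduce degree-(2M-2) product
        if (acc >> i) & 1:
            acc ^= POLY << (i - M)
    return acc
```

In hardware this loop unrolls into a tree of XOR gates, which is why a coprocessor can make field multiplication fast for any m in the supported range once the reduction polynomial is configurable.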
7.
8.
9.
A session secret key (SSK) enables secure remote communication between parties and plays an important role in real-world open-network deployments. Traditional SSKs are established with authenticated key exchange (AKE) protocols built on a public-key infrastructure, whose certificate issuance, renewal, and revocation incur heavy computation, communication, and storage costs. Identity-based AKE (ID-AKE) protocols remove this problem, but most existing ID-AKE protocols are built on foreign cryptographic algorithms; no ID-AKE protocol based on Chinese commercial cryptography has yet been formally published at home or abroad, which falls short of China's requirement for autonomous control of core cryptographic technology. The SM2 authenticated key exchange (SM2-AKE) protocol is widely used in commercial cryptography for its security and efficiency, but its unresolved certificate-management overhead greatly limits its adoption. This paper designs an identity-based authenticated key exchange protocol (SM2-ID-AKE) in the identity-based cryptography (IBC) setting, using a Schnorr-like signature method for key generation on top of SM2, and proves its security under the CDH assumption in the random-oracle model. Theoretical analysis and simulation results show that, compared with existing ID-AKE protocols, the proposed protocol saves at least 66.67% of communication bandwidth and 34.05% of computation, effectively reducing system cost and better serving the secure-communication needs of diverse users in network deployments.
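The Schnorr-like key issuance can be sketched in a toy group: the key-generation center (KGC) binds an identity to a key with a hash, so any party can recompute the user's public key from the identity and the master public key alone, with no certificate. The group, hash construction, and function names below are illustrative, not the protocol's actual SM2-curve instantiation:

```python
import hashlib

# Sketch: Schnorr-style identity-based key issuance, the certificate-free
# mechanism such an ID-AKE protocol builds on. Toy multiplicative group.

P, Q, G = 1019, 509, 4   # subgroup of prime order Q in Z_P*

def h(ident: str, r_pub: int) -> int:
    """Hash binding the identity to the commitment R."""
    data = f"{ident}|{r_pub}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def extract(master_sk: int, ident: str, r: int) -> tuple[int, int]:
    """KGC issues (R, d) for `ident`; d is the user's secret key."""
    r_pub = pow(G, r, P)
    d = (r + h(ident, r_pub) * master_sk) % Q
    return r_pub, d

def derive_public(master_pk: int, ident: str, r_pub: int) -> int:
    """Any party recomputes the user's public key g^d without a certificate."""
    return (r_pub * pow(master_pk, h(ident, r_pub), P)) % P
```

Because g^d = R * (g^s)^H(ID,R) holds by construction, a peer who knows only the identity and the KGC's master public key can derive the counterpart's public key during key exchange, eliminating certificate transmission and verification.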
10.
Wang Sen. Network Security Technology & Application, 2022(10): 22-25
As China's requirements for autonomous, controllable information security grow, SM2 and the related Chinese commercial cryptographic algorithms have come to play an important role. SM2 still suffers from incomplete adoption and inconsistent management. This paper introduces the principles of the SM2 algorithm and details its practical applications: in identity authentication, SM2 digital certificates provide mutual authentication between users and servers; in electronic seals, electronic signatures provide authoritative certification of electronic documents; and in trusted computing, root-of-trust signatures and a chain of trust certify the state of trusted system components.
11.
12.
13.
Multiple-instruction-issue processors seek to improve performance over scalar RISC processors by providing multiple pipelined functional units in order to fetch, decode and execute several instructions per cycle. The process of identifying instructions which can be executed in parallel and distributing them among the available functional units is referred to as instruction scheduling. This paper describes a simple compile-time scheduling technique, called conditional compaction, which uses the concept of conditional execution to move instructions across basic block boundaries. It then presents the results of an investigation into the performance of the technique using C benchmark programs scheduled for machines with different functional unit configurations. This paper represents the culmination of our investigation into how much performance improvement can be obtained using conditional execution as the sole scheduling technique.
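A minimal model of the conditional execution that conditional compaction relies on: after if-conversion, both branch arms become predicated operations, so they can be packed into one branch-free schedule. The instruction format and register names below are invented for illustration, not the paper's machine model:

```python
# Sketch: predicated (conditional) execution. Every operation carries an
# optional predicate register; operations from both arms of a branch are
# issued unconditionally, and only those whose predicate is true commit.

def run(program, regs):
    """Execute a straight-line predicated program over a register file."""
    for pred, dst, fn, srcs in program:
        if pred is None or regs[pred]:       # guarded commit
            regs[dst] = fn(*(regs[s] for s in srcs))
    return regs

# if (a > b) x = a; else x = b;  -- after if-conversion, no branches remain:
program = [
    (None, "p",  lambda a, b: a > b, ("a", "b")),
    (None, "np", lambda p: not p,    ("p",)),
    ("p",  "x",  lambda a: a,        ("a",)),
    ("np", "x",  lambda b: b,        ("b",)),
]
```

Because the four operations form one straight-line block, a compile-time scheduler is free to compact them into the surrounding code across the former basic-block boundary.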
14.
In simultaneous multithreading processors, raising fetch-unit throughput intensifies cache contention among threads, and that contention in turn limits fetch-unit throughput. Targeting the characteristics of modern VLIW architectures, this paper proposes a method that raises the throughput of both the fetch unit and the processor. By invalidating useless addresses as early as possible in the fetch pipeline, the method reduces program-cache conflicts caused by invalid fetches and improves overall processor performance. Experimental results show that the method improves the relative throughput of both the processor and the fetch unit by 12%-23%, while the L1 program-cache miss rate increases only slightly or even decreases. It also eliminates 10%-25% of L1 program-cache read accesses, lowering processor power consumption.
15.
Loops account for a large share of the execution time of digital-signal-processing software; buffering loop code in an instruction buffer reduces accesses to program memory and improves processor performance. This paper adds a loop-instruction buffer to the instruction pipeline of a VLIW processor. The buffer caches loop instructions and dispatches them to the functional units in software-pipelined form, so loop code is fetched from memory once but executed many times, greatly reducing memory accesses. While a loop runs, the buffer signals the program memory to enter a sleep state, lowering processor power consumption. Tests on typical applications show that with the loop buffer the fetch pipeline is idle more than 90% of the time, overall processor performance improves by about 10%, and the buffer's hardware area is about 9% of the fetch pipeline.
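The arithmetic behind the memory-access savings can be sketched directly (the buffer capacity and loop dimensions below are illustrative, not figures from the paper):

```python
# Sketch: program-memory fetch counts with and without a loop buffer.
# A loop body of `body` instruction words run for `iters` iterations is
# refetched every iteration without a buffer, but fetched once if the body
# fits in the buffer.

def fetches(body: int, iters: int, buffer_slots: int) -> int:
    if body <= buffer_slots:
        return body              # filled once, then replayed from the buffer
    return body * iters          # refetched from program memory every iteration

# e.g. a 12-word loop kernel executed 1000 times:
no_buffer = fetches(12, 1000, 0)      # 12000 memory accesses
with_buffer = fetches(12, 1000, 16)   # 12 memory accesses
```

With the buffer supplying nearly all loop instructions, the fetch stage sits idle for most of the run, which is why the program memory can be put to sleep during loops.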
16.
Timo Viitanen, Janne Helkala, Heikki Kultala, Pekka Jääskeläinen, Jarmo Takala, Tommi Zetterman, Heikki Berg. International Journal of Parallel Programming, 2018, 46(6): 1283-1303
The memories used for embedded microprocessor devices consume a large portion of the system's power. The power dissipation of the instruction memory can be reduced by using code compression methods, which may require the use of variable length instruction formats in the processor. The power-efficient design of variable length instruction fetch and decode is challenging for static multiple-issue processors, which aim for low power consumption on embedded platforms. The memory-side power savings using compression are easily lost on inefficient fetch unit design. We propose an implementation for instruction template-based compression and two instruction fetch alternatives for variable length instruction encoding on transport triggered architecture, a static multiple-issue exposed data path architecture. With applications from the CHStone benchmark suite, the compression approach reaches an average compression ratio of 44% at best. We show that the variable length fetch designs reduce the number of memory accesses and often allow the use of a smaller memory component. The proposed compression scheme reduced the energy consumption of synthesized benchmark processors by 15% and area by 33% on average.
17.
18.
19.
Journal of Systems Architecture, 2000, 46(14): 1293-1308
In VLIW architectures, the static encoding of operations executed in parallel using explicit No Operations (NOPs) is another culprit behind code-size growth. Several instruction-encoding and memory-subsystem alternatives have been proposed to limit the impact of NOPs on code size: a compressed cache using a packed encoding scheme, and a decompressed cache using an unpacked encoding scheme. The compressed cache achieves high memory utilization but increases the pipeline branch penalty because it requires very complex fetch hardware. In contrast, the decompressed cache lowers fetch overhead, because the unpacked encoding allows an instruction to issue into the pipeline without any recovery process; its shortcoming is poor memory utilization, since memory is allocated irrespective of the number of useful operations. This research proposes a new instruction encoding scheme, called the semi-packed encoding scheme, together with the section cache, which enables effective storage and retrieval of semi-packed instructions. The partially fixed instruction length reduces both the hardware complexity of instruction fetch and the memory space wasted on NOPs. The experimental results reveal that memory utilization in the section cache is 3.4 times higher than in the decompressed cache, and a memory subsystem using the section cache provides about 15% performance improvement with a moderate chip area.
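The trade-off among the three encodings can be sketched by counting stored operation slots per VLIW instruction; the section size of 2 and the rounding rule below are assumptions made for illustration, not the paper's exact formats:

```python
# Sketch: relative code size under packed, unpacked, and semi-packed
# encodings. `ops` holds the number of useful operations in each VLIW
# instruction of a toy program; issue width and section size are assumed.

def packed_size(ops):
    """Packed: only useful operations are stored (no NOPs)."""
    return sum(ops)

def unpacked_size(ops, width=8):
    """Unpacked: every instruction occupies all issue slots."""
    return width * len(ops)

def semi_packed_size(ops, section=2):
    """Semi-packed: each instruction rounded up to whole sections."""
    return sum(-(-n // section) * section for n in ops)

ops = [1, 3, 2, 8, 1]   # useful operations per instruction
```

Semi-packed sizes land between the two extremes: close to packed in storage cost, but with an instruction length fixed enough to keep the fetch hardware simple.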
20.
This paper describes a system for compressed code generation. Application code is partitioned into time-critical and non-time-critical code. Critical code is compiled to native code, and non-critical code is compiled to a very dense virtual instruction set which is executed on a highly optimized interpreter. The system employs dictionary-based compression by means of superinstructions, which correspond to patterns of frequently used base instructions. The code compression system is designed for the Philips TriMedia VLIW processor. The interpreter is pipelined to achieve a high interpretation speed. The pipeline consists of three stages: fetch, decode, and execute. While one instruction is being executed, the next instruction is decoded, and the one after that is fetched from memory. On a TriMedia VLIW with a load latency of three cycles and a jump latency of four cycles, the interpreter achieves a peak performance of four cycles per instruction and a sustained performance of 6.27 cycles per instruction. Experiments demonstrating the compression quality of the system and the execution speed of the pipelined interpreter show code about five times more compact than native TriMedia code, at a slowdown of about eight times. Copyright © 1999 John Wiley & Sons, Ltd.
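Superinstruction selection can be sketched as a greedy pass that replaces the most frequent adjacent pair of base instructions with a new combined instruction, in the style of byte-pair encoding; the real system selects patterns from application profiles, and the helper below is illustrative:

```python
from collections import Counter

# Sketch: one round of dictionary-based compression with superinstructions.
# The most frequent adjacent pair of base instructions becomes a new
# superinstruction; repeating the pass grows the dictionary.

def compress_once(code: list[str]):
    """Replace the most common adjacent pair; return (new_code, pair)."""
    pairs = Counter(zip(code, code[1:]))
    if not pairs:
        return code, None
    pair, n = pairs.most_common(1)[0]
    if n < 2:
        return code, None                  # nothing worth a dictionary slot
    out, i = [], 0
    while i < len(code):
        if i + 1 < len(code) and (code[i], code[i + 1]) == pair:
            out.append("+".join(pair))     # emit the superinstruction
            i += 2
        else:
            out.append(code[i])
            i += 1
    return out, pair
```

Each superinstruction shortens the bytecode stream and amortizes one interpreter dispatch over several base operations, which is how the system trades dictionary space for both density and speed.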