
1.

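秦宗庆 张伯泉 曹海波《微电子学与计算机》2014,(5):140-143

To address data security for portable hard drives, the AES encryption algorithm is analyzed and optimized, and an architecture is proposed that combines MicroBlaze with an AES encryption/decryption IP core using a three-stage pipeline (intra-round and inter-round). A security card for encryption and decryption that sits between the hard drive and the computer's USB port is designed and implemented. Simply connecting the card in series between the computer's USB port and the drive upgrades an ordinary hard drive to an encrypted one. Experiments on a Spartan-6 Nexys 3 FPGA development board show that the system reaches a throughput of 174.08 Mb/s at a 120 MHz clock, with high throughput and low resource consumption.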

2.

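Design of a High-Performance AES Hardware Implementation Based on Intra-Round Pipelining

郑行 王静 王云峰《中国集成电路》2014,(6):55-62

To improve AES performance, this paper designs AES hardware using intra-round pipelining. Based on an algorithmic analysis of the complex byte substitution/inverse byte substitution and column transformation/inverse column transformation (MixColumns/InvMixColumns) in the AES round unit, a 7-stage intra-round pipeline is designed for the round unit. In particular, the inverse column transformation is designed using a constant-matrix product form that reuses the column transformation, which reduces hardware resource usage. The design is implemented on several FPGA models using Xilinx ISE 10.1, and the experimental data show that the proposed scheme improves AES data throughput and the throughput/area ratio.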
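The abstract does not say which constant matrix the authors use, so the following is only a sketch of one well-known decomposition: InvMixColumns equals MixColumns applied after a cheap pre-multiplication by the circulant matrix with first row (05, 00, 04, 00). The function names are illustrative and not taken from the paper.

```python
# Hedged sketch: InvMixColumns built by reusing MixColumns.
# The decomposition is a standard identity; the paper's exact form is not given.

def xtime(b):
    """Multiply a GF(2^8) element by x, reducing by the AES polynomial 0x11B."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b

def gmul(a, b):
    """Multiply two bytes in the AES field GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a = xtime(a)
        b >>= 1
    return r

def circulant(col, first_row):
    """Multiply a 4-byte column by the circulant matrix with this first row."""
    return [
        gmul(first_row[0], col[i])
        ^ gmul(first_row[1], col[(i + 1) % 4])
        ^ gmul(first_row[2], col[(i + 2) % 4])
        ^ gmul(first_row[3], col[(i + 3) % 4])
        for i in range(4)
    ]

def mix_column(col):
    return circulant(col, [0x02, 0x03, 0x01, 0x01])

def inv_mix_column_direct(col):
    return circulant(col, [0x0E, 0x0B, 0x0D, 0x09])

def inv_mix_column_reuse(col):
    # Pre-multiply by the constant matrix, then reuse the forward datapath.
    return mix_column(circulant(col, [0x05, 0x00, 0x04, 0x00]))

col = [0xDB, 0x13, 0x53, 0x45]
assert inv_mix_column_reuse(mix_column(col)) == col
```

The pre-multiplication only needs multiplications by 04 and 05, which are two `xtime` steps and one XOR each, so in hardware the inverse path can share the forward MixColumns logic instead of duplicating the larger 09/0B/0D/0E multipliers.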

3.

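Design of an SM4 Encryption Unit with Direct Memory Access Based on RISC-V

王堃 夏宏《移动信息》2023,45(6):245-249

To meet information security's growing demand for throughput of encrypted network data, this paper designs an SM4 encryption/decryption unit with direct memory access in an open-source RISC-V processor, based on SM4, the first commercial cipher algorithm designed independently in China. The RISC-V instruction set is extended so that the new instructions can invoke the SM4 unit directly. This approach not only implements SM4 encryption and decryption in hardware but also reduces how often load and store instructions must access memory during encryption and decryption, greatly increasing encryption speed. To resolve conflicts between CPU memory accesses and SM4-unit memory accesses, the design adopts a pipeline interlock scheme, which is verified by simulation in ModelSim. At a 300 MHz clock, encrypting or decrypting 4 kB of data takes 10,500 clock cycles, giving a throughput of 914.28 Mbit/s.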
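The quoted throughput can be checked with simple arithmetic. The sketch below assumes that "4 kB" means 4000 bytes, which is the reading that reproduces the stated figure; the function name is illustrative.

```python
# Hedged check of the reported SM4 throughput (assumes 4 kB = 4000 bytes).

def throughput_mbit_s(data_bits, cycles, clock_hz):
    """Throughput in Mbit/s for data_bits processed in `cycles` clock cycles."""
    seconds = cycles / clock_hz
    return data_bits / seconds / 1e6

DATA_BITS = 4000 * 8  # assumption: decimal kilobytes
print(round(throughput_mbit_s(DATA_BITS, 10_500, 300e6), 2))  # about 914.29
```

The result, about 914.286 Mbit/s, matches the abstract's 914.28 Mbit/s if that value was truncated rather than rounded. Reading 4 kB as 4096 bytes would instead give roughly 936 Mbit/s.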

4.

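VLSI Design and Implementation of an AES Algorithm Resistant to Differential Power Analysis and Differential Fault Analysis

韩军 曾晓洋 赵佳《通信学报》2010,31(1):20-29

A hardware design and implementation scheme for AES that resists differential power analysis and differential fault analysis is proposed. The design mainly uses a countermeasure that combines data masking with two-dimensional parity checking. While preserving hardware security, hardware cost is reduced by splitting each 128-bit operation into four 32-bit operations, reusing modules, and optimizing the order of operations, and a 3-stage pipeline structure raises speed and throughput. The resulting AES IP core not only resists both side-channel attacks but also has reasonable hardware cost and computing performance.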
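The abstract names two-dimensional parity checking but gives no details, so the following is only a generic illustration of the idea on a 4x4 AES state: keep one XOR check byte per row and per column, then use the mismatching row and column to locate a single faulty byte. It is not the paper's circuit, and all names are hypothetical.

```python
# Hedged sketch of two-dimensional XOR parity over a 4x4 byte state.
from functools import reduce
from operator import xor

def parity_2d(state):
    """Return (row_checks, col_checks): XOR of each row and each column."""
    rows = [reduce(xor, row) for row in state]
    cols = [reduce(xor, col) for col in zip(*state)]
    return rows, cols

def locate_fault(state, rows, cols):
    """Compare stored checks with recomputed ones.

    Returns None if consistent, (row, col) for a single-byte fault,
    or the string "multiple" if the pattern is not a single-byte fault.
    """
    new_rows, new_cols = parity_2d(state)
    bad_rows = [i for i in range(4) if new_rows[i] != rows[i]]
    bad_cols = [j for j in range(4) if new_cols[j] != cols[j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return "multiple"

state = [[4 * r + c for c in range(4)] for r in range(4)]
rows, cols = parity_2d(state)
state[2][1] ^= 0x40                      # inject a single-bit fault
print(locate_fault(state, rows, cols))   # (2, 1)
```

A real design would also have to predict how the check bytes change across each AES transformation, which this sketch does not attempt.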

5.

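Design of a Reusable, Low-Power RISC CPU IP Core

楼向雄 骆建军 程思琪《固体电子学研究与进展》2004,24(4):450-454

A reusable, low-power intellectual property (IP) core for a reduced instruction set computer (RISC) central processing unit is designed. The RISC CPU IP core uses a single-clock-cycle, two-stage pipeline, Harvard bus architecture. At the same processing speed, its power consumption is reduced to about 1/4 of that of a conventional PIC CPU. The IP core is implemented in the 0.25 μm CMOS process of United Microelectronics Corporation (UMC), Taiwan. Test results verify the theoretical results of the paper, and the IP core has been successfully applied in industry.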

6.

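Design of a 16-bit Fixed-Point DSP Core and Its ModelSim Simulation Verification

辛晓宁 李萌《微电子学与计算机》2014,(6):180-183,188

To improve DSP efficiency, a 16-bit fixed-point DSP core with a 4-stage pipeline is designed. The design method is described at both the system level and the key-module level, with emphasis on the pipeline implementation and on the core's instruction flow and data flow, and a complete design of the DSP core is given. Finally, the instruction set supported by the core is presented and verified in the ModelSim simulation environment. The results show that the DSP core executes every instruction correctly, reaches a maximum clock frequency of 12.5 MHz, and completes high-speed operations within a single machine cycle.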

7.

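ASIC Design and Implementation of the SM3 Hash Algorithm

张倩 李树国《微电子学与计算机》2014,(9)

A four-in-one ASIC architecture is proposed for SM3, the Chinese national commercial cryptographic hash algorithm. The architecture uses carry-save adders and loop unrolling. Compared with a single-round structure, the number of clock cycles is reduced by 75% and the throughput is increased by 29.4%. In a 65 nm SMIC process, the throughput reaches 4 Gb/s at a low clock frequency of 125 MHz. The SM3 chip has been taped out and supports padding and pause functions.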
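The reported numbers are mutually consistent under a simple model, sketched below. SM3 compresses a 512-bit block in 64 rounds; the sketch assumes the four-in-one design spends exactly 16 cycles per block with no extra overhead cycles. The implied clock of the single-round baseline is an inference from the 29.4% figure, not a value stated in the abstract.

```python
# Hedged consistency check of the SM3 figures (cycle counts are assumptions).

BLOCK_BITS = 512   # SM3 message block size
ROUNDS = 64        # SM3 compression rounds per block

def throughput_mbps(cycles_per_block, clock_mhz):
    """Throughput in Mb/s when one block takes cycles_per_block cycles."""
    return BLOCK_BITS * clock_mhz / cycles_per_block

single_cycles = ROUNDS          # one round per cycle
unrolled_cycles = ROUNDS // 4   # four rounds per cycle
cycle_reduction = 1 - unrolled_cycles / single_cycles

unrolled = throughput_mbps(unrolled_cycles, 125)          # 4000.0 Mb/s
implied_single = unrolled / 1.294                          # baseline throughput
implied_single_clock = implied_single * single_cycles / BLOCK_BITS

print(cycle_reduction, unrolled, round(implied_single_clock, 1))
```

Under these assumptions the unrolled design reaches 4 Gb/s at 125 MHz, and the single-round baseline would need a clock of roughly 386 MHz to sit 29.4% below it, which illustrates why unrolling lets the chip run at a much lower frequency.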

8.

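Design and Implementation of a JPEG Decoder IP Core

张志晓 何明华《电子科技》2011,24(4):59-63

This paper presents the design and implementation of a decoder IP core based on the JPEG still-image compression standard. The design adopts an IDCT algorithm structure suited to hardware implementation and raises processing speed by combining increased arithmetic parallelism with pipelining. Exploiting the characteristics of the Huffman code stream, a new parallel Huffman decoding hardware structure is adopted that replaces complex pattern matching with simple arithmetic operations, giving fast decoding at low hardware cost. The IP core can be easily integrated into, for example, digital...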

9.

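Design and Implementation of an LPCC Floating-Point IP Core

李倩 侯义斌 黄樟钦 何东之 王晋嘉 赵丽娜 高曦《微电子学与计算机》2008,25(1):85-88,92

The extraction algorithm for linear prediction cepstrum coefficients (LPCC) is introduced, a floating-point IP core implementation model of the algorithm is given, and the design of each submodule is described in detail. The design is written in VHDL, synthesized and simulated with ISE and ModelSim, and implemented on a Xilinx Spartan-3 FPGA target board. It is optimized with techniques such as pipelining the critical path and resource sharing. The IP core produces high-precision results with a short computation time and has been successfully used in an embedded speech recognition system.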

10.

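FPGA Design of the Multiply-Accumulate Unit for a BP Neural Network Image Compression Algorithm

杨隽 周诠 张敏瑞《现代电子技术》2009,32(19):38-41

A scheme is proposed for implementing an image compression algorithm based on a three-layer feedforward BP neural network, using a design approach that combines reloadable IP cores with VHDL code. The multiply-accumulate unit, a key unit in the scheme, is designed for FPGA. The module uses pipelined processing, which increases data throughput, reduces system latency, and raises the clock frequency. Behavioral-level functional simulation of the unit is completed, and the simulation results confirm the feasibility of the FPGA design.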

11.

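VLSI Design of a NoC Routing Node   Cited by: 2 (self-citations: 2, citations by others: 0)

王剑 王宏 杨志家《微电子学与计算机》2010,27(1)

Based on the wormhole switching strategy and a destination-address deterministic routing algorithm, a routing node for a network-on-chip is implemented with a three-stage pipeline. The routing node suits Mesh and Torus topologies and uses virtual channels to increase throughput. Implemented on a Xilinx FPGA, the routing node runs at a clock frequency of up to 130 MHz with a transmission bandwidth of 20.8 Gb/s.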
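Dividing the stated bandwidth by the stated clock gives the number of bits the router moves per cycle. The short sketch below does only that arithmetic. How those bits are split across ports is not given in the abstract; five 32-bit ports would be one possible reading, but that is a guess.

```python
# Hedged arithmetic: aggregate bits transferred per clock cycle.

def bits_per_cycle(bandwidth_mbps, clock_mhz):
    """Aggregate bits moved per cycle for a given bandwidth and clock."""
    return bandwidth_mbps / clock_mhz

print(bits_per_cycle(20_800, 130))  # 160.0
```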

12.

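Algorithm Optimization and VLSI Design of a JPEG2000 Arithmetic Encoder   Cited by: 1 (self-citations: 1, citations by others: 0)

刘文松 朱恩 王健 徐龙涛 林叶《电子学报》2011,39(11):2486-2491

The algorithm and circuit implementation of the JPEG2000 arithmetic encoder are studied. A new sequential structure for the renormalization procedure is proposed: by adding an independent procedure that predicts the total number of shifts, the encoding algorithm can process the current context sequentially in a single pass. On this basis, a three-stage pipelined circuit with a slave pipeline is designed. The main pipeline handles the common case in which no coded byte is output, and the slave pipeline separately handles the output of coded bytes, which effectively shortens the critical-path delay of each stage...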

13.

A high-throughput low-cost AES processor   Cited by: 5 (self-citations: 0, citations by others: 5)

Chih-Pin Su Tsung-Fu Lin Chih-Tsiun Huang Cheng-Wen Wu 《Communications Magazine, IEEE》2003,41(12):86-91

We propose an efficient hardware implementation of the advanced encryption standard algorithm, with key expansion capability. Compared to the widely used table lookup technique, the proposed basis transformation technique reduces the hardware overhead of the S-box by 64 percent. Our pipelined design has a very high throughput rate. Using typical 0.35 μm CMOS technology, a 200 MHz clock is easily achieved, and the throughput rate in the non-feedback cipher mode is 2.38 Gb/s for 128-bit keys, 2.008 Gb/s for 192-bit keys, and 1.74 Gb/s for 256-bit keys, respectively. Testability of the design is also considered. The hardware cost of the AES design is approximately 58 K gates using a standard synthesis flow.

14.

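FPGA Implementation of a High-Speed RS(204,188) Decoder

许林峰《电讯技术》2007,47(4):152-155

The principle and an FPGA implementation scheme of the RS(204,188) decoder widely used in digital television broadcasting are introduced. A parallel three-stage pipeline structure is adopted to increase speed, and the decoder design is optimized based on the Berlekamp-Massey (BM) algorithm to reduce hardware consumption. The decoder's maximum clock frequency reaches 75 MHz. Performance simulation and FPGA implementation of the decoder verify the feasibility of the scheme.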

15.

A 16-bit cascaded sigma-delta pipeline A/D converter

李梁 李儒章 俞宙 张加斌 张俊安《半导体学报》2009,30(5):103-108

A low-noise cascaded multi-bit sigma-delta pipeline analog-to-digital converter (ADC) with a low oversampling rate is presented. The architecture is composed of a second-order 5-bit sigma-delta modulator and a cascaded 4-stage 12-bit pipelined ADC, and operates at a low 8X oversampling rate. The static and dynamic performance of the whole ADC can be improved by using a dynamic element matching technique. The ADC operates at a 4 MHz clock rate and dissipates 300 mW at a 5 V/3 V analog/digital power supply. It is developed in a 0.35 μm CMOS process and achieves an SNR of 82 dB.

16.

A 16-bit cascaded sigma-delta pipeline A/D converter

Li Liang Li Ruzhang Yu Zhou Zhang Jiabin Zhang Jun'an 《半导体学报》2009,30(5)

A low-noise cascaded multi-bit sigma-delta pipeline analog-to-digital converter (ADC) with a low oversampling rate is presented. The architecture is composed of a second-order 5-bit sigma-delta modulator and a cascaded 4-stage 12-bit pipelined ADC, and operates at a low 8X oversampling rate. The static and dynamic performance of the whole ADC can be improved by using a dynamic element matching technique. The ADC operates at a 4 MHz clock rate and dissipates 300 mW at a 5 V/3 V analog/digital power supply. It is developed in a 0.35 μm CMOS process and achieves an SNR of 82 dB.

17.

A 965-Mb/s 1.0-μm standard CMOS twin-pipe serial/parallel multiplier

Larsson-Edefors P. 《Solid-State Circuits, IEEE Journal of》1996,31(2):230-239

This paper presents a two's complement high-speed twin-pipe serial/parallel multiplier architecture which produces y=cd, where c is the parallel coefficient and d is the serial data. The multiplier is based on the twin pipeline (twin-pipe) concept, in which two data bits are processed each clock cycle. The high serial data throughput rate is mainly due to the use of: 1) a novel twin-pipe architecture, 2) new twin-pipe adder types, and 3) a new multiplier circuit structure. A 4-bit high-speed twin-pipe serial/parallel multiplier, on an active area of 0.224 mm², has been designed and fabricated in a 1.0-μm N-well double-metal single-poly CMOS process. Testing of the multiplier shows that the maximal serial data throughput rate is 965 Mb/s at V_dd=5 V.
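To make the "two data bits per clock cycle" idea concrete, here is a purely behavioral sketch of a two's complement serial/parallel multiplication that consumes the serial operand d two bits at a time, least significant bit first. It models only the arithmetic, not the paper's adder types or circuit structure, and the function name and `width` parameter are illustrative.

```python
# Hedged behavioral model: y = c * d with d consumed two bits per cycle.

def twin_pipe_multiply(c, d, width):
    """Multiply c by a width-bit two's complement d, two serial bits per cycle.

    Returns (product, cycles). The most significant bit of d carries a
    negative weight, as in any two's complement representation.
    """
    acc = 0
    cycles = 0
    for i in range(0, width, 2):
        for j in (i, i + 1):              # the two bits handled this cycle
            if j < width and (d >> j) & 1:
                weight = -(1 << j) if j == width - 1 else (1 << j)
                acc += c * weight
        cycles += 1
    return acc, cycles

print(twin_pipe_multiply(-3, 5, 4))   # (-15, 2)
```

With two bits per cycle, a 965 Mb/s serial data rate would correspond to a clock of about 482.5 MHz, assuming the rate counts data bits of d.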

18.

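Research on Implementing a Configurable FFT IP Core on FPGA

李大习《电子科技》2014,27(6):46-49,53

A configurable IP core for the FFT algorithm is implemented on an FPGA. Using a pipelined structure and a fast parallel algorithm, it implements the butterfly operation and a 4k-point FFT whose number of input points, data width, and decomposition radix are freely configurable. The design is written in Verilog, simulated with ModelSim, synthesized and downloaded with ISE, and verified on a Xilinx Virtex-5 xc5vfx70t device at a 200 MHz clock. Its computational efficiency compares favorably with that of other designs.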
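For readers unfamiliar with the butterfly operation, the following is a minimal software sketch of a radix-2 decimation-in-time FFT built from butterflies. It is a floating-point reference model only. The paper's core has a configurable radix and fixed data widths, neither of which is modeled here, and the function names are illustrative.

```python
# Hedged reference model: recursive radix-2 FFT built from butterflies.
import cmath

def butterfly(a, b, w):
    """Radix-2 butterfly: combine a and b using twiddle factor w."""
    t = w * b
    return a + t, a - t

def fft_radix2(x):
    """FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], w)
    return out

print(fft_radix2([1, 0, 0, 0]))  # a unit impulse transforms to all ones
```

A 4k-point transform in this form has 12 butterfly stages, which is the kind of structure a pipelined hardware design would map to successive stages.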

19.

A High Performance Early Acknowledged Asynchronous Pipeline using Hybrid-logic Encoding

《Integration, the VLSI Journal》2020

This paper details a novel asynchronous pipelining methodology that maximizes the throughput, buffering capacity, and robustness of gate-level pipelined systems. The data paths in the proposed pipeline style are encoded using a hybrid logic encoding scheme, which combines the simplicity of single-rail encoding with the robustness of dual-rail encoding. The control path that provides the synchronization between pipeline stages is constructed based on the simple and high-speed early acknowledgment protocol. Further, the proposed pipeline accommodates an isolate phase to achieve 100% storage capacity. Two test cases, a 4-bit, 10-stage FIFO and a 16-bit adder, have been designed in 90 nm technology to validate the proposed pipeline style. The FIFO has been laid out in the UMC 180 nm process using the Cadence tool suite. The post-layout results of the FIFO show 12.5% better throughput than the high capacity single-rail pipeline. Simulation results of the adder also reveal that the proposed structure achieves a throughput of 3.44 Giga-items/sec, which is 44.18% higher than the APCDP (Asynchronous pipeline based on constructed critical path) and 11.9% higher than the high capacity single-rail pipelines.

20.

Efficient architectures for two-dimensional discrete wavelet transform using lifting scheme.

Chengyi Xiong Jinwen Tian Jian Liu 《IEEE transactions on image processing》2007,16(3):607-614

Novel architectures for 1-D and 2-D discrete wavelet transform (DWT) by using lifting schemes are presented in this paper. An embedded decimation technique is exploited to optimize the architecture for 1-D DWT, which is designed to receive an input and generate an output with the low- and high-frequency components of original data being available alternately. Based on this 1-D DWT architecture, an efficient line-based architecture for 2-D DWT is further proposed by employing parallel and pipeline techniques, which is mainly composed of two horizontal filter modules and one vertical filter module, working in parallel and pipeline fashion with 100% hardware utilization. This 2-D architecture is called fast architecture (FA) and can perform J levels of decomposition for an N × N image in approximately 2N^2(1 - 4^(-J))/3 internal clock cycles. Moreover, another efficient generic line-based 2-D architecture is proposed by exploiting the parallelism among four subband transforms in lifting-based 2-D DWT, which can perform J levels of decomposition for an N × N image in approximately N^2(1 - 4^(-J))/3 internal clock cycles; hence, it is called high-speed architecture. The throughput rate of the latter is double that of the former 2-D architecture, with only a small additional hardware cost. Compared with the works reported in previous literature, the proposed architectures for 2-D DWT are efficient alternatives in the tradeoff among hardware cost, throughput rate, output latency, and control complexity.
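The two cycle-count formulas can be read as geometric sums over decomposition levels. The sketch below checks that reading with exact fractions. It assumes that level j works on an (N/2^j) × (N/2^j) subband and that the fast architecture consumes two samples per cycle and the high-speed architecture four. Those per-cycle rates are an interpretation that reproduces the formulas, not something stated in the abstract.

```python
# Hedged check: closed-form cycle counts versus a per-level sum.
from fractions import Fraction

def cycles_closed_form(n, levels, samples_per_cycle):
    """2N^2(1 - 4^-J)/3 for 2 samples/cycle, N^2(1 - 4^-J)/3 for 4."""
    return Fraction(4 * n * n, 3 * samples_per_cycle) * (1 - Fraction(1, 4 ** levels))

def cycles_by_level(n, levels, samples_per_cycle):
    """Sum the samples of each level's input subband, divided by the rate."""
    return sum(Fraction((n >> j) ** 2, samples_per_cycle) for j in range(levels))

n, levels = 512, 3
fa = cycles_closed_form(n, levels, 2)   # fast architecture
ha = cycles_closed_form(n, levels, 4)   # high-speed architecture
print(fa, ha)                           # 172032 86016
```

For a 512 × 512 image and three levels this gives 172,032 cycles for the fast architecture and 86,016 for the high-speed one, consistent with the stated doubling of throughput.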