Similar literature
 Found 20 similar documents (search time: 343 ms)
1.
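Applying the AES algorithm to real-time data encryption places high demands on processing speed and on the power consumption and cost of FPGA implementations. To address this, an improved fast AES method based on a small FPGA is presented: a microprocessor performs the AES key expansion, and a sharing technique lets the encryption and decryption modules use the same key. Experimental results show that the method effectively increases processing speed, saves FPGA resources, and lowers chip power consumption.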

2.
Field-programmable gate arrays (FPGAs) have come a long way from serving as glue logic to providing complete system solutions. This is mostly due to their general-purpose reconfigurable nature, lower non-recurring engineering (NRE) cost, and fast time to market. The reconfigurable nature of FPGAs gave rise to a new field, reconfigurable computing, in which the circuit configuration can be changed after hardware production. Applying reconfigurable computing to self-adaptive hardware lets the hardware adapt to varying environmental conditions and needs by swapping or loading different computational modules. This work proposes a design methodology, the enhanced DPR security system (EDPRSS), that targets high performance on an FPGA device together with low power consumption, security, and reduced area. In the proposed technique, hash code generation (HCG) and encryption hardware accelerators are instantiated dynamically on the FPGA using partial reconfiguration, according to the application requirements. The system can swap the corresponding hardware accelerator in or out at run time, which reduces power and area. Two reconfigurable partitions are created, one for the encryption algorithm and one for the HCG algorithm. Experimental results show that the proposed technique performs better than other conventional systems.

3.
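WU Jianfeng, ZHENG Bowen, NIE Yi, CHAI Zhilei. Computer Engineering (《计算机工程》), 2021, 47(12): 147-155, 162
In fields such as digital currency, blockchain, and cloud data encryption, conventional software-based encryption and decryption suffers from slow computation, consumption of host resources, and high power consumption, while field-programmable gate array (FPGA) encryption systems implemented in Verilog/VHDL suffer from long development cycles and difficult maintenance and upgrades. For the 3DES algorithm, an OpenCL-based FPGA accelerator design is proposed. The design is a parallel pipelined structure with 48 iteration rounds. In the data transfer module, data storage adjustment and data bit-width improvement strategies raise the kernel's effective bandwidth utilization. In the encryption module, an instruction stream optimization strategy forms a pipelined parallel architecture, and kernel vectorization and compute unit replication further improve kernel performance. Experimental results show that the accelerator achieves a throughput of 111.801 Gb/s on an Intel Stratix 10 GX2800. Compared with an Intel Core i7-9700 CPU, performance improves by a factor of 372 and energy efficiency by a factor of 644; compared with an Nvidia GeForce GTX 1080Ti GPU, performance improves by 20% and energy efficiency by a factor of 9.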

4.
This work presents a permutation- and diffusion-based hybrid image cryptosystem in the transform domain, using combined chaotic maps and the Haar Integer Wavelet Transform (HIWT). HIWT is used to transform the plain image, and the four sub-bands of image coefficients are encrypted by combined chaotic maps. Combining two one-dimensional chaotic maps results in better chaotic behavior and generates a large, unpredictable random sequence that can be used to encrypt the image. To manage the trade-offs between security, speed and power consumption, the proposed encryption algorithm is modeled on a Cyclone II Field Programmable Gate Array (FPGA). The proposed design occupies only 4025 logic elements and takes 0.28 ms to encrypt an image of size 256 × 256. The robustness of the algorithm is estimated using quality metrics including statistical and differential attack analysis. The proposed scheme is resistant to most known attacks and is more secure than other image encryption schemes.
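The abstract does not give the transform's equations. As an illustration only, the sketch below uses a common integer Haar formulation (the S-transform lifting step); this choice, and the function names `haar_forward`, `haar_inverse` and `haar_1d`, are assumptions and not taken from the paper. It shows why an integer wavelet transform suits encryption: the step is exactly invertible in integer arithmetic, so no rounding error is introduced before or after the cipher stage.

```python
def haar_forward(a, b):
    """One integer Haar (S-transform) step on a pair of samples."""
    s = (a + b) >> 1   # floor of the average: low-pass coefficient
    d = a - b          # difference: high-pass coefficient
    return s, d


def haar_inverse(s, d):
    """Exact inverse of haar_forward, using integer arithmetic only."""
    a = s + ((d + 1) >> 1)
    b = a - d
    return a, b


def haar_1d(row):
    """One decomposition level of an even-length row: [low half | high half]."""
    pairs = [haar_forward(row[i], row[i + 1]) for i in range(0, len(row), 2)]
    return [s for s, _ in pairs] + [d for _, d in pairs]


print(haar_1d([5, 2, 4, 1]))
```

Applying `haar_1d` to every row and then to every column of an image would yield the four sub-bands the abstract mentions.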

5.
There has been increasing concern for the security of multimedia transactions over real-time embedded systems. Partial and selective encryption schemes have been proposed in the research literature, but these schemes significantly increase the computation cost, leading to tradeoffs in system latency, throughput, hardware requirements and power usage. In this paper, we propose a lightweight multimedia encryption strategy based on a modified discrete wavelet transform (DWT), which we refer to as the secure wavelet transform (SWT). The SWT provides joint multimedia encryption and compression through two modifications to traditional DWT implementations: (a) parameterized construction of the DWT and (b) subband re-orientation for the wavelet decomposition. The SWT has rational coefficients, which allows us to build a high-throughput hardware implementation in fixed-point arithmetic. We obtain a zero-overhead implementation on custom hardware. Furthermore, a look-up-table-based reconfigurable implementation allows us to allocate the encryption key to the hardware at run time. A direct implementation on a Xilinx Virtex FPGA gave a clock frequency of 60 MHz, while a reconfigurable multiplier-based design gave an improved clock frequency of 114 MHz. The pipelined implementation of the SWT achieved a clock frequency of 240 MHz on a Xilinx Virtex-4 FPGA and met the timing constraint of 500 MHz on a standard cell realization using 45 nm CMOS technology.

6.
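To meet the need for remote upgrading of field-programmable gate arrays (FPGAs), several FPGA configuration methods suited to remote upgrade are introduced. Based on an analysis and comparison of device configuration principles and on engineering practice, two new configuration methods based on active serial (AS) mode are proposed: a general-purpose remote-upgrade configuration method and a new-type remote-upgrade configuration method. Neither method affects single-board debugging. Both make remote upgrade easy to implement without adding any discrete components, and both offer clear advantages in reducing cost and power consumption.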

7.
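Existing security schemes for wireless body area networks (WBANs) suffer from high complexity, high power consumption, and poor practicality. A combined chaotic stream encryption scheme is proposed that meets the WBAN requirements of high security and low power consumption. The algorithm supports three quantization precisions. It first uses a tent map to perturb the orbit of a logistic map to produce a chaotic sequence, then combines this sequence with an m-sequence, which has good balance and autocorrelation properties, to generate the keystream, and finally XORs the keystream with the plaintext to produce the ciphertext. The algorithm was modeled in the Verilog hardware description language and verified at board level on a field-programmable gate array (FPGA). In security tests on standard grayscale images, the information entropy of the encrypted image reached 7.9994 and the correlation coefficient of adjacent pixels was close to 0. The results show that, compared with existing algorithms, the ciphertext images have better correlation properties and higher information entropy.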
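The keystream pipeline described in this abstract (tent map perturbing a logistic map, combination with an m-sequence, XOR with the plaintext) can be sketched as follows. This is a floating-point toy under stated assumptions, not the paper's fixed-point Verilog design: the map parameters, the additive perturbation rule, the 8-bit quantization, the LFSR polynomial, and all function names are hypothetical choices made for illustration.

```python
def logistic(x, mu=3.99):
    """Logistic map step (mu is an assumed parameter)."""
    return mu * x * (1.0 - x)


def tent(y, a=1.99):
    """Tent map step; a is kept just below 2 so the floating-point orbit does not collapse to 0."""
    return a * y if y < 0.5 else a * (1.0 - y)


def lfsr_step(r):
    """8-bit Galois LFSR for x^8 + x^6 + x^5 + x^4 + 1 (a maximal-length polynomial)."""
    return ((r << 1) ^ ((r >> 7) * 0x71)) & 0xFF


def keystream(n, x0=0.3141, y0=0.2718, seed=0x5A):
    """Generate n keystream bytes; (x0, y0, seed) play the role of the key. seed must be nonzero."""
    x, y, r = x0, y0, seed
    out = bytearray()
    for _ in range(n):
        y = tent(y)
        x = (logistic(x) + y) % 1.0   # assumed rule: tent output perturbs the logistic orbit
        r = lfsr_step(r)              # simplification: use the whole LFSR state as a byte
        out.append((int(x * 256) & 0xFF) ^ r)
    return bytes(out)


def crypt(data, **key):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    ks = keystream(len(data), **key)
    return bytes(d ^ k for d, k in zip(data, ks))


message = b"WBAN sample"
assert crypt(crypt(message)) == message
```

Because encryption is a plain XOR with the keystream, decryption is the same operation with the same key, which is one reason stream ciphers of this kind map to small hardware.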

8.
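Field-programmable gate arrays (FPGAs) have broad prospects in computer vision applications, but their limited on-chip memory resources struggle to meet the performance, size, and power requirements of these applications. To address this, the allocation of on-chip memory resources is studied, and an easy-to-implement partition balancing algorithm is proposed with the goal of minimizing on-chip resource usage and overall power consumption. Experimental results show that, compared with a commercial FPGA high-level synthesis tool, the proposed algorithm improves utilization by up to 60% and reduces dynamic power by about 70%. In experiments with the high-level MeanShift tracking algorithm, the partitioning algorithm reduces total power by up to 30% without affecting key performance.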

9.
Authenticated encryption schemes provide confidentiality and integrity services simultaneously. The CAESAR competition will identify a portfolio of authenticated ciphers that is expected to be suitable for widespread adoption and to offer advantages over AES-GCM. Besides security, an important criterion for selecting the final candidates is hardware performance in resource-limited environments. In this paper, the SILC, CLOC, AES-JAMBU, and COLM authenticated ciphers have been selected from the third round of the CAESAR competition for hardware evaluation. The main reasons for choosing these schemes are their lightweight design, sufficient security level, and use of the AES algorithm as the underlying block cipher. To the best of our knowledge, this is the first time that an 8-bit lightweight architecture compatible with API v2 has been presented for the selected schemes. To implement AES, Atomic-AES v2, one of the smallest implementations, has been adapted to the requirements of the selected schemes. Furthermore, several techniques are used to reduce the area of the hardware implementation, including implementing one AES core in the datapath, sharing registers to store intermediate values, implementing the tweak functions by shuffling wires, and implementing doubling in GF(2^128) with an 8-bit architecture to construct the higher-order multipliers. Implementation results are presented for ASIC and FPGA platforms. The proposed architecture for each scheme is similar on the two platforms, but different optimization techniques are used for each platform; for example, the AES S-box is ROM-based on FPGA and logic-based on ASIC. Comparing the results with 128-bit implementations shows that the area on FPGA and ASIC is reduced by up to 65% and 88%, respectively.
The results of the current study demonstrate that AES-JAMBU has the lowest hardware area and the highest throughput and performance on both platforms. In addition, CLOC has the highest area reduction on both platforms compared with the 128-bit implementations.
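To make the "doubling in GF(2^128) with an 8-bit architecture" concrete, here is a minimal byte-serial sketch. It assumes the reduction polynomial x^128 + x^7 + x^2 + x + 1 (constant 0x87) and big-endian byte order, as commonly used for doubling in block-cipher modes; the abstract does not state which conventions each scheme uses, and the function names are hypothetical. The point is that an 8-bit datapath only needs one byte and a single carry bit per step.

```python
def double_bytewise(block: bytes) -> bytes:
    """Multiply a 16-byte field element by x, one byte per step (8-bit datapath)."""
    assert len(block) == 16
    out = bytearray(16)
    carry = 0
    for i in range(15, -1, -1):          # start at the least significant byte
        shifted = (block[i] << 1) | carry
        out[i] = shifted & 0xFF
        carry = shifted >> 8             # bit shifted out feeds the next byte
    if carry:                            # x^128 overflowed: reduce with 0x87
        out[15] ^= 0x87
    return bytes(out)


def double_reference(block: bytes) -> bytes:
    """Same operation on a 128-bit integer, used only to cross-check the byte-serial version."""
    x = int.from_bytes(block, "big") << 1
    if x >> 128:
        x = (x & ((1 << 128) - 1)) ^ 0x87
    return x.to_bytes(16, "big")


sample = bytes(range(0x80, 0x90))
assert double_bytewise(sample) == double_reference(sample)
```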

10.
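MA Xujian, LIU Shu, GAO Mingze, DONG Xiuze. Application Research of Computers (《计算机应用研究》), 2023, 40(6): 1825-1828, 1844
GIFT, an improved version of the PRESENT algorithm, has a simpler and more efficient structure, but its performance on FPGAs still leaves room for improvement. A new implementation is proposed that reduces the algorithm's 40 iteration rounds to 20 and performs encryption/decryption in parallel with round-key generation. After synthesis for an xc6slx16 FPGA, the design reaches a frequency of 194 MHz and a throughput of 1.2 Gbps, and takes 21 clock cycles. The results show that the proposed method outperforms existing work and uses fewer clock cycles, and that high-speed operation on FPGAs is feasible.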
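The reported figures can be cross-checked with the usual throughput formula for an iterative block cipher core: block size × clock frequency ÷ cycles per block. The sketch below assumes a 128-bit block, which the abstract does not state explicitly but which is consistent with its 40 rounds (GIFT-128 has 40 rounds, GIFT-64 has 28). The function name is illustrative.

```python
def throughput_bps(block_bits, clock_hz, cycles_per_block):
    """Throughput of a core that finishes one block every cycles_per_block cycles."""
    return block_bits * clock_hz / cycles_per_block


# Values from the abstract: 194 MHz, 21 clock cycles; 128-bit block is an assumption.
gbps = throughput_bps(128, 194e6, 21) / 1e9
print(f"{gbps:.2f} Gb/s")
```

Under that assumption the result is about 1.18 Gb/s, which agrees with the 1.2 Gbps quoted in the abstract.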

11.
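Convolutional neural network algorithms, with their superior performance, are widely used, but their large parameter counts, computational complexity, and high inter-layer independence make them hard to deploy efficiently in edge scenarios with low power budgets and limited resources. Taking these characteristics into account, a hybrid-architecture method for accelerating convolutional neural network computation is proposed. The method adopts a hybrid CPU plus FPGA architecture and compresses and optimizes the network model; on the FPGA, a DSP array with instruction-controlled data flow ...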

12.
Early estimation of application-specific power consumption has become one of the major constraints of modern ASIC design. While in early stages of the design process precise power consumption can only be obtained from very time-consuming gate-level (GTL) simulation, power estimation methodologies aim to reduce computational overhead by deriving models that approximate power consumption at higher levels. This work presents an FPGA-accelerated power estimation methodology for programmable processors, based on a hybrid of functional level power analysis (FLPA) and instruction level power analysis (ILPA), that can be mapped onto an FPGA together with the functional emulation. It enables fast and accurate estimation of application-specific power consumption and energy per task, which is crucial for power-aware design of embedded processor architectures. The approach allows both hardware and software designers to optimize their implementations not only for processing performance but also for power efficiency. The power emulation methodology and considerations for the FPGA implementation of the power estimation are described in detail. Model validation against GTL power simulation and results are given for a typical embedded RISC processor and a commercial-grade Application Specific Instruction Set Processor (ASIP). The power consumption models yield fast and accurate power estimation, with a %MAE of less than 9% and an NRMSE of less than 7%, enabling co-optimization of hardware and software with respect to power consumption in early design stages.

13.
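Because FPGAs are well suited to the heterogeneous multi-core, parallel, low-cost, and low-energy requirements of high-performance computing, the many-body problem, one of the important applications of high-performance computing, is studied. The widely used fast multipole method (FMM) for the many-body problem and each of its computational kernels are analyzed, and the kernels are implemented on an FPGA device. Compared with implementations of the same kernels on a multi-core CPU, the FPGA achieves a good speedup in every case. Some advantages of applying FPGAs to high-performance computing and the problems currently faced are analyzed, as a preliminary exploration of the wider use of FPGAs in high-performance computing.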

14.
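To address weak security, high development cost, and poor real-time capability in network security encryption systems, a design for an FPGA-based reconfigurable encryption engine is proposed. The overall design structure of the engine is described in detail, and solutions to the key technical issues in the FPGA implementation are analyzed. Simulation experiments show that the engine effectively improves the reconfigurability of the FPGA device, with a reconfigurable resource ratio of up to 0.78. The engine therefore has good prospects, in terms of speed and reconfigurability, for the future development of embedded security products.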

15.
Today the ICT industry accounts for 2–4% of worldwide carbon emissions, which are estimated to double by 2020 in a business-as-usual scenario. A remarkable part of the large volume of energy consumed in the Internet today is due to the over-provisioning of network resources such as routers, switches and links to meet stringent reliability requirements. Performance and energy issues are therefore important factors in designing gigabit routers for future networks. However, the design and prototyping of energy-efficient routers is challenging for multiple reasons, such as the lack of power measurements from live networks and of a good understanding of how energy consumption varies under different traffic loads and switch/router configuration settings. Moreover, the exact energy saving gained by adopting different energy-efficient techniques in different hardware prototypes is often poorly known. In this article, we first propose a measurement framework that can quantify and profile the detailed energy consumption of sub-components in the NetFPGA OpenFlow switch. We then propose a new power-scaling algorithm that adapts the operational clock frequencies, and with them the energy consumption, of the FPGA core and the Ethernet ports to the actual traffic load. We also propose a new energy profiling method, which allows the detailed power performance of network devices to be studied. Results show that our solution achieves a higher level of energy efficiency than some existing approaches: the upper and lower bounds of the power consumption of the NetFPGA OpenFlow switch are shown to be 30% lower than those of the commercial HP Enterprise switch. Moreover, the new switch architecture can save up to 97% of the dynamic power consumption of the FPGA chip in the lowest frequency mode.

16.
The online/offline technique is regarded as a promising approach to speeding up encryption, because most of the computation, such as pairings over points on an elliptic curve and exponentiations in groups, can be pre-computed in the offline phase without knowing the message to be encrypted and/or the recipient’s identity. The online phase requires only light computation, such as modular multiplication. In this paper, we propose two novel identity-based online/offline schemes: a fully secure identity-based online/offline encryption scheme and an identity-based online/offline signcryption scheme. Compared with the other schemes in the literature, our schemes achieve the shortest ciphertext size in both the offline and online phases and demonstrate the best performance in offline computation. Our schemes are applicable to devices with limited computation power. They are proven secure in the random oracle model.

17.
Cryptographic primitives are extensively used in today's applications to provide the desired security. Malicious or accidental faults that occur in hardware implementations of cryptographic primitives, specifically, in this paper, the Advanced Encryption Standard (AES), can produce erroneous output from the encryption/decryption process and reduce the reliability of the cryptographic hardware. Using a suitable fault-tolerant scheme for AES, to recover it from failures or attacks and bring it back to an operational state, is crucial for reliability and consequently for security. In this paper, two novel online fault-tolerant schemes are proposed for AES. In the proposed fault-tolerant architecture, the round path is modified and divided into two pipeline stages. The proposed schemes are based on a combination of hardware and time redundancy: a new hardware redundancy is proposed for the AES round function and a time redundancy for the hardware of the AES key expansion unit. The presented schemes are valid for all versions of AES and are independent of how its S-box is implemented. Both ASIC and FPGA implementations of the original and the proposed fault-tolerant AES are reported, along with Full TMR (Triple Modular Redundancy) and Full TTR (Triple Time Redundancy) structures as traditional fault-tolerant schemes. It is shown that the first proposed fault-tolerant architecture, named TMRrp&TTRke32, outperforms these approaches and the previous report in the literature in terms of area overhead and therefore power consumption. The second approach, named TMRrp&TTRke64, is better than the other approaches at achieving a trade-off between area overhead and throughput overhead.
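For readers unfamiliar with the baseline the abstract compares against, the following sketch shows generic triple modular redundancy: three copies of a computation feed a bitwise 2-of-3 majority voter. It does not reproduce the paper's TMRrp&TTRke32 or TMRrp&TTRke64 structures; the `tmr` helper, its `faults` parameter (XOR error masks used to simulate faults), and the stand-in function `f` are hypothetical.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


def tmr(func, x, faults=(0, 0, 0)):
    """Run three copies of func; faults[i] is an XOR error mask injected into copy i."""
    results = [func(x) ^ mask for mask in faults]
    return majority(*results)


def f(x):
    """Stand-in for a protected unit such as a cipher round (illustrative only)."""
    return (x * 3 + 1) & 0xFF


assert tmr(f, 7) == f(7)                          # fault-free
assert tmr(f, 7, faults=(0, 0x10, 0)) == f(7)     # one faulty copy is outvoted
assert tmr(f, 7, faults=(0x10, 0x10, 0)) != f(7)  # two identical faults defeat the voter
```

Triple time redundancy applies the same voter to three sequential runs on one hardware unit, trading throughput for area instead of area for throughput.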

18.
This paper presents a new countermeasure against Side-Channel Analysis (SCA) attacks, whose implementation is based on hardware-software co-design. The hardware architecture consists of a microprocessor, which executes the algorithm using a false key, and a coprocessor, which performs the operations needed to retrieve the original text that was encrypted with the real key. The coprocessor hardly affects the power consumption of the device, so any classical attack based on that power consumption would reveal a false key. Additionally, because the operations carried out by the coprocessor are performed in parallel with the microprocessor, the execution time for encrypting a given text is not affected by the proposed countermeasure. To verify the correctness of our proposal, the system was implemented on a Virtex 5 FPGA. Different SCA attacks were performed on several functions of the AES algorithm. Experimental results show that in all cases the system is effectively protected, revealing a false encryption key.

19.
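Conventional uncooled infrared core platforms based on a digital signal processor (DSP) plus a field-programmable gate array (FPGA) suffer from large size, high power consumption, poor real-time performance, and low system integration. A miniaturized uncooled infrared core platform based on a single FPGA is proposed. Targeting a 25 μm uncooled infrared detector and the requirements of small size and low power, the platform uses an advanced FPGA and DDR3 memory and combines hardware logic algorithms with a NIOS II soft core to handle the timing drive of the infrared detector, temperature control, image non-uniformity correction, image enhancement, and the various human-machine interface controls. Experimental results show that the system delivers high imaging quality, consumes less than 2 W, has a latency below 0.5 ms, and is highly extensible.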

20.
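With the rapid development of quantum computing, traditional public-key cryptosystems face the threat of being broken, and migrating existing encryption technology to quantum-safe post-quantum cryptography (PQC) schemes is a current research focus in the cryptographic community. Among existing PQC schemes, lattice-based schemes have become the most promising thanks to their security, ease of implementation, and flexibility. SHA-3 is one of the core operators in lattice-based schemes, used to generate pseudorandom sequences and to hash critical information, so its implementation performance strongly affects the performance of the overall PQC scheme. Given the large future demand for deploying PQC across many kinds of devices, hardware implementations of SHA-3 face a bottleneck in which high performance and limited resource overhead constrain each other. This paper therefore proposes an efficient, high-speed SHA-3 hardware architecture that applies to all SHA-3 family functions. First, the design reduces the 64-bit round constants to 7 bits, which cuts both the storage needed for the round constants and the computational complexity. Second, a new pipeline structure is proposed that divides the critical path more evenly than a conventional pipeline. Finally, the new pipeline structure is combined with unrolling, which greatly increases system throughput. A prototype was implemented on a Xilinx Virtex-6 field-programmable gate array (FPGA). The results show that the SHA-3 hardware unit reaches a maximum operating frequency of 459 MHz and an efficiency of 14.71 Mbps/Slice. Compared with existing related designs, the maximum operating frequency is 10.9% higher and the efficiency is 28.2% higher.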
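The reduction of the round constants from 64 bits to 7 bits rests on a property of the Keccak iota step: each round constant can only have bits set at lane positions 2^j - 1 for j = 0..6 (positions 0, 1, 3, 7, 15, 31 and 63). The sketch below generates the constants with the standard 8-bit LFSR and packs each one into a 7-bit code. The packing order and the names `compress` and `expand` are assumptions for illustration; the abstract does not describe the paper's exact encoding.

```python
# The only lane bit positions an iota round constant can set: 2^j - 1 for j = 0..6.
POSITIONS = [(1 << j) - 1 for j in range(7)]   # [0, 1, 3, 7, 15, 31, 63]


def keccak_round_constants(rounds=24):
    """Generate the 64-bit round constants from the 8-bit LFSR x^8 + x^6 + x^5 + x^4 + 1."""
    constants = []
    r = 1
    for _ in range(rounds):
        rc = 0
        for pos in POSITIONS:
            r = ((r << 1) ^ ((r >> 7) * 0x71)) & 0xFF
            if r & 2:
                rc |= 1 << pos
        constants.append(rc)
    return constants


def compress(rc):
    """Pack a 64-bit round constant into 7 bits (bit j holds lane bit 2^j - 1)."""
    return sum(((rc >> pos) & 1) << j for j, pos in enumerate(POSITIONS))


def expand(code):
    """Rebuild the 64-bit round constant from its 7-bit code."""
    return sum(((code >> j) & 1) << pos for j, pos in enumerate(POSITIONS))


RC = keccak_round_constants()
CODES = [compress(rc) for rc in RC]
assert all(code < 128 and expand(code) == rc for code, rc in zip(CODES, RC))
```

In hardware, `expand` is pure wiring (seven stored bits routed to seven fixed lane positions), so a table of 24 × 7 bits can stand in for 24 × 64 bits.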
