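FPGA Accelerator for 3DES Algorithm Based on OpenCL
Cite this article: WU Jianfeng, ZHENG Bowen, NIE Yi, CHAI Zhilei. FPGA Accelerator for 3DES Algorithm Based on OpenCL[J]. Computer Engineering, 2021, 47(12): 147-155, 162.
Authors: WU Jianfeng, ZHENG Bowen, NIE Yi, CHAI Zhilei
Affiliation: 1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China; 2. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China; 3. State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi, Jiangsu 214215, China
Funding: National Natural Science Foundation of China (61972180); Open Fund of the State Key Laboratory of Mathematical Engineering and Advanced Computing (2018A04).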
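Abstract: In fields such as digital currency, blockchain, and cloud data encryption, encryption and decryption performed in software suffer from low computation speed, heavy use of host resources, and high power consumption, while field-programmable gate array (FPGA) encryption and decryption systems implemented in Verilog/VHDL suffer from long development cycles and difficult maintenance and upgrades. For the 3DES algorithm, an OpenCL-based FPGA accelerator design is proposed. A pipelined parallel structure with 48 rounds of iteration is designed. In the data transfer module, data storage adjustment and data bit-width improvement strategies raise the kernel's effective bandwidth utilization. In the encryption module, an instruction stream optimization strategy forms a pipelined parallel architecture, and kernel vectorization and compute unit replication further improve kernel performance. Experimental results show that the accelerator achieves a throughput of 111.801 Gb/s on an Intel Stratix 10 GX2800. Compared with an Intel Core i7-9700 CPU, performance improves by 372 times and energy efficiency by 644 times; compared with an Nvidia GeForce GTX 1080Ti GPU, performance improves by 20% and energy efficiency by 9 times.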

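Keywords: OpenCL framework; field-programmable gate array (FPGA); encryption and decryption algorithms; 3DES algorithm; pipelined parallel structure
Received: 2020-10-20
Revised: 2020-12-07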

FPGA Accelerator for 3DES Algorithm Based on OpenCL
WU Jianfeng, ZHENG Bowen, NIE Yi, CHAI Zhilei. FPGA Accelerator for 3DES Algorithm Based on OpenCL[J]. Computer Engineering, 2021, 47(12): 147-155, 162.
Authors: WU Jianfeng, ZHENG Bowen, NIE Yi, CHAI Zhilei
Affiliation: 1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China; 2. School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China; 3. State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi, Jiangsu 214215, China
Abstract: Encryption and decryption algorithms are widely used in digital currency, blockchain, cloud data encryption, and other fields. Traditional software-based encryption is limited in computation speed, occupies substantial host resources, and consumes considerable power. Field Programmable Gate Array (FPGA)-based encryption and decryption systems implemented in Verilog/VHDL, in turn, suffer from long development cycles and difficult maintenance and upgrades. To address these problems, a design scheme for an FPGA accelerator for the 3DES algorithm based on OpenCL is proposed. The scheme builds a pipelined parallel structure with 48 iterations by adjusting data storage, widening the data bit width, optimizing the instruction stream, vectorizing kernels, and replicating compute units. In the data transmission module, the effective bandwidth utilization of the kernel is improved by adjusting data storage and increasing the data bit width. In the encryption module, the instruction stream is optimized to form a pipelined parallel architecture. Kernel performance is further improved by kernel vectorization and compute unit replication. Experimental results show that the accelerator achieves a throughput of 111.801 Gb/s on an Intel Stratix 10 GX2800. Compared with the Intel Core i7-9700 CPU, the proposed accelerator improves performance by 372 times and energy efficiency by 644 times. Compared with the Nvidia GeForce GTX 1080Ti GPU, it improves performance by 20% and energy efficiency by 9 times.
Keywords: OpenCL framework; Field Programmable Gate Array (FPGA); encryption and decryption algorithm; 3DES algorithm; pipeline parallel structure
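
The "48 iterations" in the abstract follow from the structure of 3DES: three chained DES passes (encrypt, decrypt, encrypt) of 16 Feistel rounds each. The sketch below illustrates only that structure and is not the authors' OpenCL kernel. The round function and key schedule (toy_round_function, toy_subkeys) are hypothetical placeholders standing in for the real DES tables, so the output is not real 3DES ciphertext. Because every round depends only on the previous round's output and a precomputed subkey, the 48 rounds can in principle be unrolled into one stage each, which is the kind of pipeline the abstract describes.

```python
import hashlib

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
ROUNDS_PER_DES = 16                 # DES uses 16 Feistel rounds
TOTAL_ROUNDS = 3 * ROUNDS_PER_DES   # encrypt-decrypt-encrypt -> 48 rounds


def toy_round_function(half, subkey):
    """Placeholder for the DES f-function (assumption: NOT the real E/S-box/P tables)."""
    data = half.to_bytes(4, "big") + subkey.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def toy_subkeys(key):
    """Hypothetical key schedule: 16 subkeys derived from one 64-bit key."""
    return [(key * (2 * i + 3) + i) & MASK64 for i in range(ROUNDS_PER_DES)]


def feistel(block, subkeys):
    """Run one Feistel pass over a 64-bit block, one round per subkey."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for k in subkeys:
        # Each round needs only the previous round's halves and one subkey,
        # so the loop can be fully unrolled into pipeline stages.
        left, right = right, left ^ toy_round_function(right, k)
    # Final swap: the same routine with reversed subkeys undoes the pass.
    return (right << 32) | left


def des_like(block, key, decrypt=False):
    """Single 16-round pass; decryption reuses the subkeys in reverse order."""
    subkeys = toy_subkeys(key)
    return feistel(block, subkeys[::-1] if decrypt else subkeys)


def triple_encrypt(block, k1, k2, k3):
    """EDE composition: E_k3(D_k2(E_k1(block)))."""
    return des_like(des_like(des_like(block, k1), k2, decrypt=True), k3)


def triple_decrypt(block, k1, k2, k3):
    """Inverse of triple_encrypt: D_k1(E_k2(D_k3(block)))."""
    return des_like(des_like(des_like(block, k3, decrypt=True), k2), k1, decrypt=True)
```

Calling triple_decrypt(triple_encrypt(p, k1, k2, k3), k1, k2, k3) returns p for any 64-bit block p. With k1 = k2 = k3 the EDE chain collapses to a single pass, which mirrors the backward-compatibility property of real 3DES.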
This article is indexed in Wanfang Data and other databases.