Design of CNN accelerated SoC system based on FPGA
Cite this article: ZHAO Shuo, FAN Jun, HE Hu. Design of CNN accelerated SoC system based on FPGA[J]. Computer Engineering and Design, 2020, 41(4): 939-944.
Authors: ZHAO Shuo, FAN Jun, HE Hu
Affiliation: Institute of Microelectronics, Tsinghua University, Beijing 100084, China
Abstract: To improve the speed and energy efficiency with which current hardware runs convolutional neural networks (CNNs), an acceleration module was designed for the convolution computation of mainstream CNNs, and an SoC system for CNN acceleration was implemented on an FPGA. The hardware platform is a ZCU102 FPGA development board with an ARM processor, and the system follows a processor-plus-accelerator architecture. The accelerator performs the convolution computation; it uses tiling and reorders the convolution loops so that data held in the on-chip buffers is reused more often, which reduces data transfer between the system and memory. Convolution kernel sizes from 1×1 to 11×11 are supported, and the activation functions implemented in hardware are ReLU and Leaky ReLU. The processor handles control and the remaining computations of the CNN, which keeps the SoC system general-purpose and flexible. Experimental results show that at a working frequency of 100 MHz the peak performance reaches 42.13 GFLOPS, an improvement over CPU execution and over other FPGA implementations.

Keywords: convolutional neural network; image processing; convolution acceleration; data reuse; software and hardware collaboration
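
The abstract names tiling and loop reordering as the source of on-chip data reuse but gives no tile sizes or loop order. The sketch below is therefore only an illustration of the general technique in plain Python, not the authors' design. The tile parameters `tm`, `tc`, `th`, `tw`, the function names, and the Leaky ReLU slope of 0.1 are all assumptions made for the example. It shows that a tiled, reordered loop nest computes the same result as the direct one while touching only one small block of inputs, weights, and outputs at a time.

```python
def leaky_relu(x, slope=0.1):
    """Leaky ReLU; slope=0.0 gives plain ReLU. The 0.1 default is an assumption."""
    return x if x > 0 else slope * x


def _shape(inp, w):
    # inp[c][y][x]: input feature maps; w[m][c][ky][kx]: square K x K kernels
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    M, K = len(w), len(w[0][0])
    return C, M, K, H - K + 1, W - K + 1  # stride 1, no padding


def conv_naive(inp, w):
    """Direct convolution: one output pixel at a time."""
    C, M, K, OH, OW = _shape(inp, w)
    out = [[[0.0] * OW for _ in range(OH)] for _ in range(M)]
    for m in range(M):
        for oy in range(OH):
            for ox in range(OW):
                for c in range(C):
                    for ky in range(K):
                        for kx in range(K):
                            out[m][oy][ox] += inp[c][oy + ky][ox + kx] * w[m][c][ky][kx]
    return out


def conv_tiled(inp, w, tm=2, tc=2, th=2, tw=2):
    """Tiled convolution with the kernel loops hoisted above the tile loops.

    The four outer loops pick one tile. In an accelerator, the input,
    weight, and output blocks for that tile would sit in on-chip buffers
    while the inner loops run, so each loaded value is reused many times.
    """
    C, M, K, OH, OW = _shape(inp, w)
    out = [[[0.0] * OW for _ in range(OH)] for _ in range(M)]
    for m0 in range(0, M, tm):                  # output-channel tiles
        for oy0 in range(0, OH, th):            # output-row tiles
            for ox0 in range(0, OW, tw):        # output-column tiles
                for c0 in range(0, C, tc):      # input-channel tiles
                    for ky in range(K):         # kernel loops moved outward
                        for kx in range(K):
                            for m in range(m0, min(m0 + tm, M)):
                                for c in range(c0, min(c0 + tc, C)):
                                    wv = w[m][c][ky][kx]  # one weight reused across the tile
                                    for oy in range(oy0, min(oy0 + th, OH)):
                                        for ox in range(ox0, min(ox0 + tw, OW)):
                                            out[m][oy][ox] += inp[c][oy + ky][ox + kx] * wv
    return out


def activate(fmap, slope=0.1):
    """Apply Leaky ReLU elementwise to an output feature-map stack."""
    return [[[leaky_relu(v, slope) for v in row] for row in ch] for ch in fmap]
```

Calling `activate(conv_tiled(inp, w))` on small nested lists gives the same values as `activate(conv_naive(inp, w))`; only the order of the multiply-accumulate operations changes. As a sanity check on the reported figure, 42.13 GFLOPS at 100 MHz corresponds to about 421 floating-point operations per clock cycle.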
This article is indexed in the VIP (维普) and Wanfang Data (万方数据) databases, among others.