FPGA-based convolutional neural network fixed-point acceleration
Citation: LEI Xiaokang, YIN Zhigang, ZHAO Ruilian. FPGA-based convolutional neural network fixed-point acceleration[J]. Journal of Computer Applications, 2020, 40(10): 2811-2816.
Authors: LEI Xiaokang  YIN Zhigang  ZHAO Ruilian
Affiliation: 1. School of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China; 2. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Abstract: To address the high power consumption and slow execution of Convolutional Neural Networks (CNNs) on resource-constrained hardware devices, a fixed-point computation acceleration method for CNNs based on Field Programmable Gate Array (FPGA) was proposed. First, a fixed-point quantization method was proposed: to reduce the storage space of CNN parameters, a different scale parameter was designed for each convolution layer and the relative divergence was used to determine the bit width; the effect of different quantization intervals on CNN accuracy was also studied. Then, a parameter reuse method and a pipelined calculation method were designed to accelerate the convolution computation. To verify the acceleration effect of the CNN after fixed-point quantization, two datasets, face and ship, were used. The results show that, compared with traditional floating-point convolution computation, with little loss of CNN accuracy, when the weight parameters and input feature map parameters are quantized to 7 bits, on the face recognition CNN model the compressed weight parameter file is about 22% of its original size, the convolution calculation speedup is 18.69, and the utilization rate of the multiplier-accumulators in the FPGA reaches 94.5%. The experimental results show that the proposed method can improve the speed of convolution computation and make efficient use of FPGA hardware resources.
Keywords: Convolutional Neural Network (CNN)  fixed-point quantization  Field Programmable Gate Array (FPGA)  model compression  YOLO model
Received: 2020-03-16
Revised: 2020-04-22
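The abstract describes quantizing weights and input feature maps to 7-bit fixed point with a different scale parameter per convolution layer. A minimal NumPy sketch of per-layer fixed-point quantization is shown below; the function names, the power-of-two scale choice, and the `int8` storage container are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def quantize_per_layer(weights: np.ndarray, bits: int = 7):
    """Quantize one layer's weights to signed fixed point with a
    per-layer power-of-two scale (illustrative sketch only)."""
    qmax = 2 ** (bits - 1) - 1                     # e.g. 63 for 7-bit signed
    max_abs = max(float(np.max(np.abs(weights))), 1e-12)
    # Pick the largest fractional bit count such that max_abs still
    # fits into the representable range [-qmax-1, qmax].
    frac_bits = int(np.floor(np.log2(qmax / max_abs)))
    scale = 2.0 ** frac_bits
    q = np.clip(np.round(weights * scale), -qmax - 1, qmax).astype(np.int8)
    return q, frac_bits

def dequantize(q: np.ndarray, frac_bits: int) -> np.ndarray:
    """Recover approximate float weights from the fixed-point values."""
    return q.astype(np.float32) / (2.0 ** frac_bits)
```

A power-of-two scale keeps dequantization to a simple bit shift in hardware, which is one common reason per-layer fixed-point schemes on FPGAs are designed this way.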
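The abstract also states that relative divergence was used to determine the bit width. The paper's exact formula is not given in the abstract; the sketch below assumes a KL-style histogram comparison between the original float weights and their quantize-dequantize reconstruction, picking the smallest-divergence candidate, which is one common way such a criterion is realized:

```python
import numpy as np

def relative_divergence(p: np.ndarray, q: np.ndarray) -> float:
    # Normalize both histograms to probability distributions and sum
    # p * log(p / q) over bins where both are non-zero (KL-style measure).
    p = p.astype(np.float64) / p.sum()
    q = q.astype(np.float64) / q.sum()
    mask = (p > 0) & (q > 0)
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def choose_bit_width(weights: np.ndarray,
                     candidates=(4, 5, 6, 7, 8), n_bins=128) -> int:
    # Histogram of the original float weights is the reference distribution.
    ref, edges = np.histogram(weights, bins=n_bins)
    best_bits, best_div = candidates[0], float("inf")
    for bits in candidates:
        qmax = 2 ** (bits - 1) - 1
        scale = qmax / float(np.max(np.abs(weights)))
        deq = np.round(weights * scale) / scale    # quantize-dequantize
        hist, _ = np.histogram(deq, bins=edges)
        d = relative_divergence(ref, hist)
        if d < best_div:
            best_bits, best_div = bits, d
    return best_bits
```

The candidate set and bin count here are arbitrary illustrative choices; in practice the search would be run per layer, consistent with the per-layer scale parameters described above.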
This article has been indexed by databases including Wanfang Data.