Foundation items: NSFC-Guangdong Joint Fund (U1501254); Guangdong Natural Science Funds for Distinguished Young Scholars (2014A030306004); Open Project of the Fujian Provincial Key Laboratory of Information Processing and Intelligent Control (MJUKF201733).

Received: 2018-03-12
Revised: 2018-04-17

CNN quantization and compression strategy for edge computing applications
CAI Ruichu, ZHONG Chunrong, YU Yang, CHEN Bingfeng, LU Ye, CHEN Yao. CNN quantization and compression strategy for edge computing applications[J]. Journal of Computer Applications, 2018, 38(9): 2449-2454.
Authors:CAI Ruichu  ZHONG Chunrong  YU Yang  CHEN Bingfeng  LU Ye  CHEN Yao
Affiliation:1. College of Computer Science, Guangdong University of Technology, Guangzhou Guangdong 510006, China;2. College of Computer and Control Engineering, Nankai University, Tianjin 300353, China;3. Advanced Digital Sciences Center, Singapore 138602, Singapore
Abstract: Focused on the problem that the memory- and compute-intensive nature of a Convolutional Neural Network (CNN) limits its deployment on embedded "edge" devices, a CNN compression method was proposed that combines network weight pruning with data quantization targeted at the data types of embedded hardware platforms. Firstly, according to the weight distribution of each layer of the original CNN, a threshold-based pruning method was used to eliminate the weights that have little impact on network accuracy, removing redundant information from the model while preserving its important connections. Secondly, the bit-widths required by the weights and activations were analyzed based on the computational characteristics of the embedded platform, and a dynamic fixed-point quantization method was employed to reduce the bit-width of the network model. Finally, the network was fine-tuned, further compressing the model size and reducing the computational cost while maintaining inference accuracy. The experimental results show that this method reduces the storage space of VGG-19 by 95.4% (more than 22 times) while reducing accuracy by only 0.3 percentage points, achieving almost lossless compression. Meanwhile, evaluation on multiple network models shows that, within an average accuracy change of 1.46 percentage points, the method reduces model storage space by up to 96.12% (about 25 times), demonstrating that it compresses convolutional neural networks effectively.
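The two compression steps the abstract describes — threshold-based weight pruning followed by dynamic fixed-point quantization with a per-layer fractional length — can be sketched as below. This is an illustrative reconstruction in plain Python, not the authors' implementation; the layer values, threshold, and bit-width are hypothetical, and the fine-tuning step is omitted.

```python
import math

def prune(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold,
    keeping only the important connections (threshold pruning)."""
    return [0.0 if abs(w) < threshold else w for w in weights]

def dynamic_fixed_point(weights, bit_width):
    """Quantize a layer's weights to `bit_width`-bit fixed point.
    The integer length is chosen per layer from the largest magnitude;
    the remaining bits (minus one sign bit) form the fractional part."""
    max_abs = max(abs(w) for w in weights) or 1.0
    int_len = max(0, math.floor(math.log2(max_abs)) + 1)  # bits for the integer part
    frac_len = bit_width - 1 - int_len                    # one bit reserved for the sign
    step = 2.0 ** -frac_len                               # quantization step size
    lo = -(2 ** (bit_width - 1)) * step                   # most negative representable value
    hi = (2 ** (bit_width - 1) - 1) * step                # most positive representable value
    return [min(max(round(w / step) * step, lo), hi) for w in weights]

# Hypothetical layer: prune with threshold 0.05, then quantize to 8 bits.
layer = [0.5, -0.02, 1.5, -0.7, 0.001]
quantized = dynamic_fixed_point(prune(layer, 0.05), bit_width=8)
```

Because the fractional length adapts to each layer's weight range, small-magnitude layers keep fine resolution while large-magnitude layers avoid overflow, which is what makes per-layer dynamic fixed point preferable to one global fixed-point format.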
Keywords: Convolutional Neural Network (CNN); edge computing; network pruning; data quantization; network compression