Mixed-Clipping Quantization for Convolutional Neural Networks
Cite this article: Huang Zhengzhe, Du Huimin, Chang Libo. Mixed-Clipping Quantization for Convolutional Neural Networks[J]. Journal of Computer-Aided Design & Computer Graphics, 2021, 33(4): 553-559. DOI: 10.3724/SP.J.1089.2021.18509
Authors: Huang Zhengzhe, Du Huimin, Chang Libo
Affiliation: School of Electronic Engineering, Xi'an University of Posts & Telecommunications, Xi'an 710121
Abstract: Quantization is the principal method for compressing convolutional neural networks and accelerating their inference. Most existing quantization methods quantize all layers to the same bit width; mixed-precision quantization can achieve higher accuracy at the same compression ratio, but finding a good mixed-precision quantization strategy is difficult. To address this problem, a mixed-clipping quantization method for convolutional neural networks based on reinforcement learning is proposed: reinforcement learning is used to search for a mixed-precision quantization strategy, and the weight data are clipped with mixed clipping thresholds according to the searched strategy before being quantized, further improving the accuracy of the quantized network. The method is evaluated on ResNet18/50 and MobileNet-V2 (Top-1 accuracy on ImageNet, before and after quantization) and on YOLOv3 (mAP on the COCO dataset, before and after quantization). Compared with HAQ and ZeroQ, the Top-1 accuracy of MobileNet-V2 quantized to 4 bits improves by 2.7% and 0.3%, respectively; compared with per-layer quantization, the mAP of YOLOv3 quantized to 6 bits improves by 2.6%.
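To illustrate the clipping-then-quantization step described in the abstract, the following Python/NumPy sketch applies symmetric uniform quantization to one layer's weights after clipping them to a threshold chosen by minimizing quantization error. The function names, the MSE criterion, and the candidate-threshold grid are illustrative assumptions, not the authors' implementation.

import numpy as np

def uniform_quantize(w, bits, clip):
    """Symmetric uniform quantization of weights clipped to [-clip, clip]."""
    scale = clip / (2 ** (bits - 1) - 1)         # quantization step for signed integer levels
    w_clipped = np.clip(w, -clip, clip)
    return np.round(w_clipped / scale) * scale   # de-quantized (simulated) weights

def search_clip(w, bits, num_candidates=100):
    """Choose the clipping threshold that minimizes quantization MSE (illustrative criterion)."""
    max_abs = np.abs(w).max()
    best_clip, best_err = max_abs, np.inf
    for ratio in np.linspace(0.3, 1.0, num_candidates):
        clip = ratio * max_abs
        err = np.mean((w - uniform_quantize(w, bits, clip)) ** 2)
        if err < best_err:
            best_clip, best_err = clip, err
    return best_clip

# Toy example: quantize one layer's weights to 4 bits with a searched clipping threshold.
w = np.random.randn(256, 64).astype(np.float32)
clip = search_clip(w, bits=4)
w_q = uniform_quantize(w, bits=4, clip=clip)
print(f"clip = {clip:.3f}, quantization MSE = {np.mean((w - w_q) ** 2):.6f}")

Clipping before quantization trades a small saturation error on the largest weights for a finer quantization step on the bulk of the distribution, which is why a searched threshold usually beats quantizing to the full weight range.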

Keywords: convolutional neural networks  mixed-precision quantization  reinforcement learning  mixed clipping
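In the paper the mixed-precision strategy is found with reinforcement learning; as a simplified stand-in, the sketch below scores randomly sampled per-layer bit-width assignments under an assumed average-bit-width budget, using per-layer quantization MSE as a proxy objective. The budget value, the 2-8 bit range, and the proxy score are assumptions for illustration only, not the paper's search procedure.

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-layer weight tensors; the paper's networks are ResNet18/50, MobileNet-V2 and YOLOv3.
layers = [rng.standard_normal((64, 32)),
          rng.standard_normal((128, 64)),
          rng.standard_normal((256, 128))]

def layer_quant_error(w, bits):
    """Proxy for accuracy loss: MSE of symmetric uniform quantization at the given bit width."""
    clip = np.abs(w).max()
    scale = clip / (2 ** (bits - 1) - 1)
    return np.mean((w - np.round(np.clip(w, -clip, clip) / scale) * scale) ** 2)

def average_bits(bit_widths, layers):
    """Parameter-weighted average bit width, standing in for the compression-ratio constraint."""
    sizes = np.array([w.size for w in layers], dtype=float)
    return float(np.dot(bit_widths, sizes) / sizes.sum())

budget = 4.0                                          # assumed average-bit-width budget
best_bits, best_err = None, np.inf
for _ in range(500):
    bits = rng.integers(2, 9, size=len(layers))       # candidate strategy: 2..8 bits per layer
    if average_bits(bits, layers) > budget:
        continue                                      # reject strategies that exceed the budget
    err = sum(layer_quant_error(w, b) for w, b in zip(layers, bits))
    if err < best_err:
        best_bits, best_err = bits.copy(), err
print("best per-layer bit widths:", best_bits, "| proxy error:", float(best_err))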

Indexed in CNKI, VIP (Weipu), Wanfang Data, and other databases.