Mixed-Clipping Quantization for Convolutional Neural Networks
Cite this article: Huang Zhengzhe, Du Huimin, Chang Libo. Mixed-Clipping Quantization for Convolutional Neural Networks[J]. Journal of Computer-Aided Design & Computer Graphics, 2021, 33(4): 553-559. DOI: 10.3724/SP.J.1089.2021.18509
Authors: Huang Zhengzhe, Du Huimin, Chang Libo
Affiliation: School of Electronic Engineering, Xi'an University of Posts & Telecommunications, Xi'an 710121
Abstract: Quantization is the principal method for compressing convolutional neural networks and accelerating their inference. Most existing quantization methods quantize all layers to the same bit width; mixed-precision quantization can achieve higher accuracy at the same compression ratio, but finding a good mixed-precision quantization strategy is difficult. To address this problem, a mixed-clipping quantization method for convolutional neural networks based on reinforcement learning is proposed: reinforcement learning is used to search for a mixed-precision quantization strategy, and the weight data are clipped with mixed clipping thresholds according to the searched strategy before being quantized, further improving the accuracy of the quantized network. The method is evaluated on ResNet18/50 and MobileNet-V2 (Top-1 accuracy on ImageNet, before and after quantization) and on YOLOv3 (mAP on the COCO dataset, before and after quantization). Compared with HAQ and ZeroQ, the Top-1 accuracy of MobileNet-V2 quantized to 4 bits improves by 2.7% and 0.3%, respectively; compared with per-layer quantization, the mAP of YOLOv3 quantized to 6 bits improves by 2.6%.
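To illustrate the clipping-then-quantization step described in the abstract, the following Python/NumPy sketch applies symmetric uniform quantization to one layer's weights after clipping them to a threshold chosen by minimizing quantization error. The function names, the MSE criterion, and the candidate-threshold grid are illustrative assumptions, not the authors' implementation.

import numpy as np

def uniform_quantize(w, bits, clip):
    """Symmetric uniform quantization of weights clipped to [-clip, clip]."""
    scale = clip / (2 ** (bits - 1) - 1)         # quantization step for signed integer levels
    w_clipped = np.clip(w, -clip, clip)
    return np.round(w_clipped / scale) * scale   # de-quantized (simulated) weights

def search_clip(w, bits, num_candidates=100):
    """Choose the clipping threshold that minimizes quantization MSE (illustrative criterion)."""
    max_abs = np.abs(w).max()
    best_clip, best_err = max_abs, np.inf
    for ratio in np.linspace(0.3, 1.0, num_candidates):
        clip = ratio * max_abs
        err = np.mean((w - uniform_quantize(w, bits, clip)) ** 2)
        if err < best_err:
            best_clip, best_err = clip, err
    return best_clip

# Toy example: quantize one layer's weights to 4 bits with a searched clipping threshold.
w = np.random.randn(256, 64).astype(np.float32)
clip = search_clip(w, bits=4)
w_q = uniform_quantize(w, bits=4, clip=clip)
print(f"clip = {clip:.3f}, quantization MSE = {np.mean((w - w_q) ** 2):.6f}")

Clipping before quantization trades a small saturation error on the largest weights for a finer quantization step on the bulk of the distribution, which is why a searched threshold usually beats quantizing to the full weight range.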

Keywords: convolutional neural networks  mixed-precision quantization  reinforcement learning  mixed clipping
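In the paper the mixed-precision strategy is found with reinforcement learning; as a simplified stand-in, the sketch below scores randomly sampled per-layer bit-width assignments under an assumed average-bit-width budget, using per-layer quantization MSE as a proxy objective. The budget value, the 2-8 bit range, and the proxy score are assumptions for illustration only, not the paper's search procedure.

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-layer weight tensors; the paper's networks are ResNet18/50, MobileNet-V2 and YOLOv3.
layers = [rng.standard_normal((64, 32)),
          rng.standard_normal((128, 64)),
          rng.standard_normal((256, 128))]

def layer_quant_error(w, bits):
    """Proxy for accuracy loss: MSE of symmetric uniform quantization at the given bit width."""
    clip = np.abs(w).max()
    scale = clip / (2 ** (bits - 1) - 1)
    return np.mean((w - np.round(np.clip(w, -clip, clip) / scale) * scale) ** 2)

def average_bits(bit_widths, layers):
    """Parameter-weighted average bit width, standing in for the compression-ratio constraint."""
    sizes = np.array([w.size for w in layers], dtype=float)
    return float(np.dot(bit_widths, sizes) / sizes.sum())

budget = 4.0                                          # assumed average-bit-width budget
best_bits, best_err = None, np.inf
for _ in range(500):
    bits = rng.integers(2, 9, size=len(layers))       # candidate strategy: 2..8 bits per layer
    if average_bits(bits, layers) > budget:
        continue                                      # reject strategies that exceed the budget
    err = sum(layer_quant_error(w, b) for w, b in zip(layers, bits))
    if err < best_err:
        best_bits, best_err = bits.copy(), err
print("best per-layer bit widths:", best_bits, "| proxy error:", float(best_err))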

Indexed in CNKI, VIP (Weipu), Wanfang Data, and other databases.