Static gesture real-time recognition method based on ShuffleNetv2-YOLOv3 model
Citation: Wen-bin XIN, Hui-min HAO, Ming-long BU, Yuan LAN, Jia-hai HUANG, Xiao-yan XIONG. Static gesture real-time recognition method based on ShuffleNetv2-YOLOv3 model[J]. Journal of Zhejiang University (Engineering Science), 2021, 55(10): 1815-1824.
Authors: Wen-bin XIN, Hui-min HAO, Ming-long BU, Yuan LAN, Jia-hai HUANG, Xiao-yan XIONG
Affiliations: 1. College of Mechanical and Vehicle Engineering, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China; 2. Harbin Electric Machinery Co., Ltd., Harbin 150040, Heilongjiang, China
Funding: National Key Research and Development Program of China (2018YFB1308700); 2020 Shanxi Provincial Special Program for Research and Development of Key Core and Common Technologies (2020XXX009, 2020XXX001)
Abstract: Aiming at the limited computing resources and small storage space of mobile platforms, an efficient real-time static gesture recognition method based on an integrated ShuffleNetv2 and YOLOv3 network was proposed to reduce the model's demand on hardware computing power. The computational complexity of the model was reduced by replacing Darknet-53 with the lightweight ShuffleNetv2 network as the backbone. The CBAM attention module was introduced to strengthen the network's attention to spatial and channel information. The K-means clustering algorithm was used to regenerate the aspect ratios and number of anchors, so that the regenerated anchor sizes localize targets accurately and improve the detection accuracy of the model. Experimental results show that the proposed algorithm achieves an average recognition accuracy of 99.2% on gesture recognition at a recognition speed of 44 frames/s; the inference time for a single 416×416 image is 15 ms on the GPU and 58 ms on the CPU, and the model occupies 15.1 MB of memory. The method offers high recognition accuracy, fast recognition speed, and low memory occupancy, which facilitates deploying the model on mobile terminals.
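The anchor regeneration step described above follows the usual YOLOv3 practice of clustering ground-truth box sizes with an IoU-based distance. The sketch below is an illustrative reconstruction under that assumption, not the authors' code; the box file name and the choice of nine anchors are assumptions.

```python
# Illustrative sketch (not the paper's code): K-means over (w, h) box sizes
# using d = 1 - IoU as the distance, as commonly done to regenerate YOLO anchors.
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between boxes and anchors, comparing width/height only."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (w, h) pairs with distance 1 - IoU; return k anchors sorted by area."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # Assign each box to the anchor it overlaps most (highest IoU).
        assignment = np.argmax(iou_wh(boxes, anchors), axis=1)
        for j in range(k):
            members = boxes[assignment == j]
            if len(members) > 0:
                anchors[j] = members.mean(axis=0)  # cluster centre; the median is also common
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]

# Usage with hypothetical (w, h) box sizes in pixels, e.g. collected from label files:
# boxes = np.loadtxt("gesture_boxes_wh.txt")   # shape (N, 2); assumed file name
# print(kmeans_anchors(boxes, k=9))
```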

Keywords: YOLOv3; lightweight ShuffleNetv2 network; CBAM attention mechanism; gesture recognition; mobile terminal

Static gesture real-time recognition method based on ShuffleNetv2-YOLOv3 model
Wen-bin XIN, Hui-min HAO, Ming-long BU, Yuan LAN, Jia-hai HUANG, Xiao-yan XIONG. Static gesture real-time recognition method based on ShuffleNetv2-YOLOv3 model[J]. Journal of Zhejiang University (Engineering Science), 2021, 55(10): 1815-1824.
Authors: Wen-bin XIN, Hui-min HAO, Ming-long BU, Yuan LAN, Jia-hai HUANG, Xiao-yan XIONG
Abstract: An efficient real-time static gesture recognition method based on an integrated ShuffleNetv2 and YOLOv3 network was proposed to reduce the model's demand on hardware computing power, targeting the limited computing resources and small storage space of mobile terminal platforms. The computational complexity of the model was reduced by replacing Darknet-53 with the lightweight ShuffleNetv2 network as the backbone. The CBAM attention module was introduced to strengthen the network's attention to spatial and channel information. The K-means clustering algorithm was used to regenerate the aspect ratios and number of anchors, so that the regenerated anchor sizes localize targets accurately and improve the detection accuracy of the model. The experimental results showed that the average recognition accuracy of the proposed algorithm on gesture recognition was 99.2% and the recognition speed was 44 frames/s; the inference time for a single 416×416 image was 15 ms on the GPU and 58 ms on the CPU, and the model occupied 15.1 MB of memory. The method has the advantages of high recognition accuracy, fast recognition speed, and low memory occupancy, which is conducive to deploying the model on mobile terminals.
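The CBAM module mentioned in the abstract combines channel attention and spatial attention in sequence (Woo et al., 2018). The following PyTorch sketch shows one standard way such a block could be attached to a ShuffleNetv2 feature map; the class names, reduction ratio, and the 232-channel 26×26 example input are assumptions for illustration, not details taken from the paper.

```python
# Standard CBAM block: channel attention followed by spatial attention.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max-pooled descriptor
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # channel-wise average map
        mx = x.amax(dim=1, keepdim=True)     # channel-wise max map
        scale = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * scale

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x):
        return self.sa(self.ca(x))

# Example: a hypothetical ShuffleNetv2 1.0x stage-3 feature map for a 416x416 input.
# x = torch.randn(1, 232, 26, 26)
# print(CBAM(232)(x).shape)   # torch.Size([1, 232, 26, 26])
```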
Keywords: YOLOv3; lightweight ShuffleNetv2 network; CBAM attention mechanism; gesture recognition; mobile terminal