
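Static Gesture Recognition Based on Residual Double Attention Module and Cross-level Feature Fusion
Cite this article: WU Jia-Lu, TIAN Qiu-Hong, YUE Jin-Hong. Static Gesture Recognition Based on Residual Double Attention Module and Cross-level Feature Fusion [J]. Computer Systems & Applications, 2022, 31(11): 111-119.
Authors: WU Jia-Lu, TIAN Qiu-Hong, YUE Jin-Hong
Affiliation: School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
Funding: National Natural Science Foundation of China (51405448); Zhejiang Sci-Tech University Doctoral Research Startup Project (11122932611817); National Undergraduate Innovation and Entrepreneurship Training Program (11120032382104); Zhejiang Province University Student Science and Technology Achievement Promotion Project (14530031661961); Education and Teaching Reform Research Project of the School of Information Science and Technology, Zhejiang Sci-Tech University (11120033312202)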
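Abstract: To address the features that convolutional neural networks miss during extraction and the insufficient extraction of multiple gesture features, this paper proposes a static gesture recognition method based on a residual double attention module and a cross-level feature fusion module. A residual double attention module is designed to enhance the low-level features extracted by the ResNet50 network; it learns key information effectively, updates the weights, and increases the attention paid to high-level features. A cross-level feature fusion module then fuses the high-level and low-level features from different stages, enriching the semantic and location information across levels in the high-level feature maps. Finally, the Softmax classifier of the fully connected layer classifies and recognizes the gesture images. Experiments on the ASL (American Sign Language) dataset give an average accuracy of 99.68%, which is 2.52% higher than that of the baseline ResNet50 network. The results confirm that the proposed method fully extracts and reuses gesture features and effectively improves the recognition accuracy of gesture images.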

Keywords: gesture image recognition; ResNet; residual double attention module; cross-level feature fusion; deep learning
Received: 2022/2/12
Revised: 2022/3/14

Static Gesture Recognition Based on Residual Double Attention Module and Cross-level Feature Fusion
WU Jia-Lu, TIAN Qiu-Hong, YUE Jin-Hong. Static Gesture Recognition Based on Residual Double Attention Module and Cross-level Feature Fusion [J]. Computer Systems & Applications, 2022, 31(11): 111-119.
Authors: WU Jia-Lu, TIAN Qiu-Hong, YUE Jin-Hong
Affiliation: School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
Abstract: To address the features that convolutional neural networks miss during extraction and the insufficient extraction of multiple gesture features, this study proposes a static gesture recognition method based on a residual double attention module and a cross-level feature fusion module. The residual double attention module enhances the low-level features extracted by a ResNet50 network; it learns the key information effectively, updates the weights, and improves the attention paid to high-level features. The cross-level feature fusion module then fuses the high-level and low-level features from different stages to enrich the semantic and location information across levels in the high-level feature map. Finally, the Softmax classifier of the fully connected layer classifies and recognizes the gesture image. Experiments are carried out on the American Sign Language (ASL) dataset. The average recognition accuracy is 99.68%, which is 2.52% higher than that of the baseline ResNet50 network. The results show that the proposed method fully extracts and reuses gesture features and effectively improves the recognition accuracy of gesture images.
Keywords: gesture image recognition; ResNet; residual double attention module; cross-level feature fusion module; deep learning
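
The abstract names three stages (residual double attention, cross-level feature fusion, Softmax classification) but does not give their internals. The following is a minimal, weight-free sketch of how such a pipeline might be wired, not the authors' implementation. It assumes that "double attention" means a channel weight plus a spatial weight, that the residual connection adds the attended map back onto the input, and that fusion downsamples the low-level map and concatenates channels. All function names are hypothetical, and the per-channel mean used as a logit is only a stand-in for a trained fully connected layer.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def channel_attention(fmap):
    """One weight per channel, from global average pooling (assumed design)."""
    weights = []
    for ch in fmap:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        weights.append(sigmoid(mean))
    return weights

def spatial_attention(fmap):
    """One weight per pixel, from the mean across channels (assumed design)."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]

def residual_dual_attention(fmap):
    """out = x + x * channel_weight * spatial_weight; the residual keeps x."""
    cw = channel_attention(fmap)
    sw = spatial_attention(fmap)
    return [[[x * (1.0 + cw[k] * sw[i][j]) for j, x in enumerate(row)]
             for i, row in enumerate(ch)] for k, ch in enumerate(fmap)]

def avg_pool2(ch):
    """2x2 average pooling so a low-level map matches a coarser high-level map."""
    return [[(ch[i][j] + ch[i][j + 1] + ch[i + 1][j] + ch[i + 1][j + 1]) / 4.0
             for j in range(0, len(ch[0]) - 1, 2)]
            for i in range(0, len(ch) - 1, 2)]

def cross_level_fusion(low, high):
    """Downsample the low-level channels, then concatenate with the high-level ones."""
    return [avg_pool2(ch) for ch in low] + high

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy feature maps in channel x height x width layout (made-up values).
low = [[[1.0, 2.0, 3.0, 4.0] for _ in range(4)],
       [[0.0] * 4 for _ in range(4)]]          # 2 channels, 4x4
high = [[[0.5, 0.5], [0.5, 0.5]]]              # 1 channel, 2x2

enhanced = residual_dual_attention(low)
fused = cross_level_fusion(enhanced, high)     # 3 channels, 2x2

# Stand-in for the fully connected layer: one logit per fused channel.
logits = [sum(sum(row) for row in ch) / 4.0 for ch in fused]
probs = softmax(logits)
```

In a real network each attention weight would come from learned convolutional or fully connected layers applied to ResNet50 feature maps; the sketch only shows how the residual term preserves the original features while the attention terms rescale them.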
