Chinese Sign Language Recognition Based on Spatial-Temporal Attention Network
Citation: LUO Yuan, LI Dan, ZHANG Yi. Chinese Sign Language Recognition Based on Spatial-Temporal Attention Network[J]. Semiconductor Optoelectronics, 2020, 41(3): 414-419.
Authors: LUO Yuan  LI Dan  ZHANG Yi
Affiliation: Institute of Photoelectric Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Engineering Research Center for Information Accessibility and Service Robots, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funding: National Natural Science Foundation of China (61801061); Science and Technology Research Program of the Chongqing Municipal Education Commission (KJQN201800607).
Abstract: Sign language recognition is widely used in communication between deaf and hearing people. Inadequate extraction of spatial-temporal features in sign language recognition tends to lower the recognition rate. To address this problem, a novel sign language recognition model based on spatial-temporal attention is proposed, which can learn more discriminative spatial-temporal features. Specifically, a spatial attention module based on a residual 3D convolutional neural network (Res3DCNN) is proposed to focus automatically on salient regions in space. Then, a temporal attention module based on convolutional long short-term memory (ConvLSTM) is introduced to measure the importance of video frames. The key idea of the proposed model is to focus on salient regions spatially and to select key frames automatically in time. Finally, experiments on the Chinese Sign Language (CSL) dataset demonstrate the effectiveness of the proposed method.

Keywords: sign language recognition  spatial-temporal attention  Res3DCNN  ConvLSTM
Received: 2019/12/30
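
The abstract names two attention steps without giving their equations: a spatial step that emphasizes salient regions within each frame, and a temporal step that weights frames by importance. The following is only a minimal NumPy sketch of that general idea, not the authors' Res3DCNN or ConvLSTM modules. The channel-mean saliency score, the scoring vector `w`, the tensor shapes, and the function names are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def spatial_attention(features):
    """Reweight each frame's feature map with a per-location saliency mask.

    features: array of shape (T, H, W, C).
    Returns (attended, mask); mask has shape (T, H, W) and sums to 1 per frame.
    """
    # Assumed saliency score: mean activation over channels (not from the paper).
    scores = features.mean(axis=-1)                       # (T, H, W)
    t, h, w = scores.shape
    mask = softmax(scores.reshape(t, h * w), axis=1).reshape(t, h, w)
    attended = features * mask[..., None]                 # broadcast over channels
    return attended, mask


def temporal_attention(frame_vectors, w):
    """Pool per-frame vectors into one clip vector using frame importance weights.

    frame_vectors: (T, D); w: (D,) hypothetical scoring vector.
    Returns (clip_vector, alpha); alpha has shape (T,) and sums to 1.
    """
    scores = frame_vectors @ w                            # one score per frame
    alpha = softmax(scores, axis=0)                       # frame importance
    clip_vector = alpha @ frame_vectors                   # weighted sum over frames
    return clip_vector, alpha


# Toy input standing in for backbone features: 8 frames, 7x7 map, 16 channels.
rng = np.random.default_rng(0)
video_features = rng.standard_normal((8, 7, 7, 16))

attended, mask = spatial_attention(video_features)
frame_vectors = attended.sum(axis=(1, 2))                 # (8, 16)
w = rng.standard_normal(16)
clip_vector, alpha = temporal_attention(frame_vectors, w)

print(mask.shape, alpha.shape, clip_vector.shape)
```

In a trained model the scores would come from learned layers rather than a channel mean or a random vector. The sketch only shows how normalized weights over locations and over frames turn into attended features.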
This article is indexed in VIP (维普) and other databases.