
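Human action recognition model based on an improved deep neural network
Cite this article: He Bingqian, Wei Wei, Zhang Bin, Gao Lianxin, Song Yanbei. Human action recognition model based on an improved deep neural network[J]. Application Research of Computers, 2019, 36(10)
Authors: He Bingqian  Wei Wei  Zhang Bin  Gao Lianxin  Song Yanbei
Affiliations: College of Computer Science, Chengdu University of Information Technology, Chengdu 610225; College of Software Engineering, Chengdu University of Information Technology, Chengdu 610225
Funding: Key Scientific Research Project of the Sichuan Provincial Department of Education (17ZA0064)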
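Abstract: Existing human action recognition methods require fixed-length video segments as input and do not make full use of spatio-temporal information. To address these problems, this paper proposes a deep neural network model that combines a spatio-temporal pyramid with an attention mechanism. The model couples a 3D-CNN containing a spatio-temporal pyramid with an LSTM augmented by a spatio-temporal attention mechanism, which enables multi-scale processing of video segments and full use of the complex spatio-temporal information of actions. RGB images and optical flow fields serve as the inputs of the spatial and temporal domains, and the feature obtained by fusing the motion and appearance features of the pyramid pooling layers serves as the input of the fusion domain. A decision fusion strategy then produces the final recognition result. Experiments on the UCF101 and HMDB51 datasets achieve recognition accuracies of 94.2% and 70.5%, respectively. The results show that the improved network achieves high recognition accuracy on video-based human action recognition.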

Keywords: action recognition  deep learning  spatio-temporal pyramid  attention mechanism  convolutional neural network
Received: 2018-06-21
Revised: 2019-08-22

Improved deep convolutional neural network for human action recognition
He Bingqian, Wei Wei, Zhang Bin, Gao Lianxin and Song Yanbei. Improved deep convolutional neural network for human action recognition[J]. Application Research of Computers, 2019, 36(10)
Authors:He Bingqian  Wei Wei  Zhang Bin  Gao Lianxin  Song Yanbei
Affiliation: College of Computer Science and Technology, Chengdu University of Information Technology
Abstract: Existing human action recognition methods require a fixed-length video segment as input and underuse spatio-temporal information. To address these problems, this paper proposes a deep neural network model that combines a spatio-temporal pyramid with an attention mechanism. The architecture couples a 3D-CNN containing spatio-temporal pyramids with an LSTM equipped with a spatio-temporal attention mechanism, which enables multi-scale processing of video segments and fuller use of the complex spatio-temporal information in actions. The spatial and temporal domains take RGB images and optical flow as input, respectively, and the fusion domain takes the feature obtained by fusing the motion and appearance features of the pyramid pooling layer. Finally, a decision fusion strategy produces the final action recognition result. In experiments on the UCF101 and HMDB51 datasets, the model achieves recognition accuracies of 94.2% and 70.5%, respectively. The results show that the improved network achieves high recognition accuracy on video-based human action recognition tasks.
Keywords:action recognition   deep learning   spatio-temporal pyramid   attention module   convolutional neural network
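
The abstract names two mechanisms that a small example may make more concrete: pyramid pooling, which removes the need for fixed-length input, and decision fusion across the spatial, temporal and fusion domains. The following is a minimal sketch under stated assumptions, not the authors' implementation. It pools along the time axis only (the paper's pyramid is spatio-temporal), uses max pooling, and fuses streams by a weighted average of softmax probabilities; the abstract specifies none of these details. The function names, pyramid levels and equal default weights are hypothetical.

```python
import math


def temporal_pyramid_pool(frames, levels=(1, 2, 4)):
    """Max-pool a variable-length list of per-frame feature vectors
    into one fixed-length vector (assumed pooling scheme)."""
    if not frames:
        raise ValueError("need at least one frame")
    t, dim = len(frames), len(frames[0])
    pooled = []
    for bins in levels:
        for b in range(bins):
            # floor/ceil bin edges keep every bin non-empty, even when t < bins
            start = (b * t) // bins
            end = -(-(b + 1) * t // bins)
            segment = frames[start:end]
            pooled.extend(max(f[d] for f in segment) for d in range(dim))
    return pooled  # length is always dim * sum(levels)


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decision_fusion(stream_scores, weights=None):
    """Fuse per-stream class scores by a weighted average of their
    softmax probabilities (assumed rule); returns (label, fused_probs)."""
    n = len(stream_scores)
    weights = weights or [1.0 / n] * n
    probs = [softmax(s) for s in stream_scores]
    fused = [sum(w * p[c] for w, p in zip(weights, probs))
             for c in range(len(probs[0]))]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Clips of different lengths map to vectors of the same size.
short_clip = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
long_clip = [[float(i), float(10 - i)] for i in range(10)]
print(len(temporal_pyramid_pool(short_clip)), len(temporal_pyramid_pool(long_clip)))

# Hypothetical class scores from the spatial, temporal and fusion streams.
label, fused = decision_fusion([[2.0, 1.0, 0.0], [0.0, 3.0, 0.0], [1.0, 2.0, 0.0]])
print(label)
```

Because the number of bins per level is fixed, the pooled vector has the same length for any clip length, which is what lets a downstream classifier accept clips of arbitrary duration. Fusing at the decision level keeps the three streams independent until their class probabilities are combined.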
This article is indexed by Wanfang Data and other databases.