
Human behavior recognition algorithm based on three-dimensional residual dense network (基于三维残差稠密网络的人体行为识别算法)
Cite this article: GUO Mingxiang, SONG Quanjun, XU Zhannan, DONG Jun, XIE Chengjun. Human behavior recognition algorithm based on three-dimensional residual dense network[J]. Journal of Computer Applications (计算机应用), 2019, 39(12): 3482-3489. DOI: 10.11772/j.issn.1001-9081.2019061056
Authors: GUO Mingxiang (郭明祥), SONG Quanjun (宋全军), XU Zhannan (徐湛楠), DONG Jun (董俊), XIE Chengjun (谢成军)
Affiliations: 1. Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei, Anhui 230031, China; 2. University of Science and Technology of China, Hefei, Anhui 230026, China
Funding: National Key Research and Development Program of China (2017YFC0806504); Anhui Provincial Science and Technology for Policing Project (安徽省科技强警项目) (201904d07020007).
Abstract: Existing human behavior recognition algorithms do not fully exploit the multi-level spatio-temporal information available within a network. To address this, a human behavior recognition algorithm based on a three-dimensional residual dense network is proposed. First, the network uses three-dimensional residual dense blocks as its building blocks; each block extracts hierarchical features of human behavior through densely connected convolutional layers. Second, a local feature aggregation adaptive method learns the local dense features of human behavior. Third, a residual connection module facilitates the flow of feature information and eases training. Finally, multiple three-dimensional residual dense blocks are cascaded to extract local features at multiple levels, and a global feature aggregation adaptive method learns from the features of all network layers to perform human behavior recognition. Structurally, the design strengthens the extraction of multi-level spatio-temporal features, and local and global feature aggregation yield more discriminative features, which improves the expressive power of the model. Extensive experiments on the benchmark datasets KTH and UCF-101 show that the recognition rate (top-1 accuracy) of the proposed algorithm reaches 93.52% and 57.35% respectively, which is 3.93 and 13.91 percentage points higher than the three-dimensional convolutional neural network (C3D) algorithm. The proposed framework has good robustness and transfer learning ability and can effectively handle a variety of video behavior recognition tasks.

Keywords: human behavior recognition; video classification; three-dimensional residual dense network; deep learning; feature aggregation
Received: 2019-06-21
Revised: 2019-09-14
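
The abstract describes the block wiring only in words, so the following is a minimal sketch of the channel bookkeeping such a design implies, not the authors' implementation. It assumes the common residual dense layout: each convolutional layer receives the block input concatenated with all earlier layer outputs, a 1x1x1 convolution fuses the concatenation back to the block width so the residual addition is shape-compatible, and a second 1x1x1 convolution fuses the outputs of all cascaded blocks. The names `BlockConfig`, `dense_input_channels`, `local_fusion_channels`, and `global_fusion_channels`, and every numeric value, are hypothetical and do not come from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BlockConfig:
    """Hypothetical hyperparameters; none of these values come from the paper."""
    in_channels: int = 64  # channels entering (and leaving) each block
    growth: int = 32       # channels produced by each densely connected layer
    num_layers: int = 4    # convolutional layers inside one block


def dense_input_channels(cfg: BlockConfig) -> list[int]:
    """Input width of each layer: block input plus all earlier layer outputs."""
    return [cfg.in_channels + i * cfg.growth for i in range(cfg.num_layers)]


def local_fusion_channels(cfg: BlockConfig) -> tuple[int, int]:
    """(in, out) widths of the assumed 1x1x1 local fusion convolution.

    It sees the block input concatenated with every layer output and
    compresses the result back to in_channels, so that the residual
    addition with the block input has matching shapes.
    """
    concat = cfg.in_channels + cfg.num_layers * cfg.growth
    return concat, cfg.in_channels


def global_fusion_channels(cfg: BlockConfig, num_blocks: int) -> tuple[int, int]:
    """(in, out) widths of the assumed 1x1x1 fusion over all block outputs."""
    return num_blocks * cfg.in_channels, cfg.in_channels


if __name__ == "__main__":
    cfg = BlockConfig()
    print(dense_input_channels(cfg))
    print(local_fusion_channels(cfg))
    print(global_fusion_channels(cfg, num_blocks=4))
```

The point of the sketch is that dense connectivity makes layer inputs grow linearly with depth, which is why a fusion step is needed before the residual connection and again before the classifier.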