Action Recognition Model Based on 3D Graph Convolution and Attention Enhanced

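Cite this article:	CAO Yi, LIU Chen, SHENG Yongjian, HUANG Zilong, DENG Xiaolong. Action Recognition Model Based on 3D Graph Convolution and Attention Enhanced[J]. Journal of Electronics & Information Technology, 2021, 43(7): 2071-2078.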

Authors:	CAO Yi, LIU Chen, SHENG Yongjian, HUANG Zilong, DENG Xiaolong

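Affiliations:	1. School of Mechanical Engineering, Jiangnan University, Wuxi 214122; 2. Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Jiangnan University, Wuxi 214122; 3. Jiangsu Information Vocational and Technical College, Wuxi 214153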

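Funding:	National Natural Science Foundation of China (51375209); Jiangsu Province "Six Talent Peaks" Program (ZBZZ-012); Jiangsu Province Outstanding Science and Technology Innovation Team Fund (2019SK07); Programme of Introducing Talents of Discipline to Universities (B18027); Jiangnan University Postgraduate Research and Practice Innovation Program (JNSJ19_005, JNKY19_048)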

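Abstract:	Current action recognition methods cannot effectively extract spatio-temporal information from non-Euclidean 3D skeleton sequences, and they lack attention to specific joints. To address these problems, this paper proposes an action recognition model based on 3D graph convolution and attention enhancement. First, the working principles of 3D convolution and graph convolution are introduced. Second, building on the graph convolution kernel, which can handle a variable number of neighbor nodes, the 3D sampling space of 3D convolution is introduced to extend the 2D graph convolution kernel into a 3D graph convolution kernel with a 3D sampling space, which yields a 3D graph convolution method. Applied to the neighbor nodes within the 3D sampling space, the 3D graph convolution kernel effectively extracts the spatio-temporal information in skeleton sequences. Third, to strengthen attention to specific joints and focus on important action information, an attention enhancement structure is designed. Fourth, the 3D graph convolution method and the attention enhancement structure are combined to build the action recognition model. Finally, skeleton-based action recognition experiments are carried out on the NTU-RGBD and MSR Action 3D skeleton action datasets. The results further verify the model's ability to extract spatio-temporal information effectively and its recognition accuracy.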
Keywords:	action recognition; 3D graph convolution; attention enhancement; spatio-temporal information
Received:	2020-06-04
Action Recognition Model Based on 3D Graph Convolution and Attention Enhanced

Yi CAO, Chen LIU, Yongjian SHENG, Zilong HUANG, Xiaolong DENG. Action Recognition Model Based on 3D Graph Convolution and Attention Enhanced[J]. Journal of Electronics & Information Technology, 2021, 43(7): 2071-2078.

Authors:	Yi CAO, Chen LIU, Yongjian SHENG, Zilong HUANG, Xiaolong DENG

Affiliation:	1. School of Mechanical Engineering, Jiangnan University, Wuxi 214122, China; 2. Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Jiangnan University, Wuxi 214122, China; 3. Jiangsu Information Vocational and Technical College, Wuxi 214153, China

Abstract:	Current action recognition methods cannot effectively extract the spatial-temporal information in non-Euclidean 3D skeleton sequences and lack attention to specific joints. To solve these problems, an action recognition model based on 3D graph convolution and attention enhancement is proposed in this paper. Firstly, the working principles of 3D convolution and graph convolution are introduced. Secondly, a 3D graph convolution method is proposed: starting from the graph convolution kernel, which can handle a variable number of neighbor nodes, the 3D sampling space of 3D convolution is introduced to extend the 2D graph convolution kernel into a 3D graph convolution kernel with a 3D sampling space. Applied to the neighbor nodes in the 3D sampling space, the 3D graph convolution kernel effectively extracts the spatial-temporal information in skeleton sequences. Thirdly, to enhance attention to specific joints and focus on important action information, an attention enhancement structure is designed. Then, by combining the 3D graph convolution method with the attention enhancement structure, the action recognition model is constructed. Finally, experiments are carried out on the NTU-RGBD and MSR Action 3D skeleton action datasets. The results further verify the model's ability to extract spatial-temporal information and its classification accuracy.

Keywords:	action recognition; 3D graph convolution; attention enhancement; spatial-temporal information
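
The abstract describes the 3D graph convolution kernel only at a high level, so the following is a minimal NumPy sketch of the general idea rather than the authors' implementation. It assumes a symmetrically normalized adjacency matrix with self-loops, a temporal window of K frames with zero padding, and one weight matrix per temporal offset. The function names (`normalize_adjacency`, `graph_conv_3d`, `joint_attention`) and the softmax-based joint weighting are hypothetical choices made for illustration; the paper's actual kernel and attention enhancement structure may differ.

```python
import numpy as np

def normalize_adjacency(A):
    """Add self-loops and apply D^-1/2 (A + I) D^-1/2 (an assumed normalization)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv_3d(X, A, W):
    """Aggregate each joint's graph neighbors over a window of K frames.

    X: (T, V, C_in) joint features for T frames and V joints.
    A: (V, V) skeleton adjacency matrix.
    W: (K, C_in, C_out), one weight matrix per temporal offset (K odd).
    """
    T, V, _ = X.shape
    K, _, C_out = W.shape
    half = K // 2
    A_norm = normalize_adjacency(A)
    # Zero-pad along time so every frame has K frames in its sampling window.
    X_pad = np.pad(X, ((half, half), (0, 0), (0, 0)))
    Y = np.zeros((T, V, C_out))
    for k in range(K):
        # Spatial aggregation over graph neighbors in the frame at offset k - half.
        neighbors = np.einsum("vu,tuc->tvc", A_norm, X_pad[k:k + T])
        # Channel mixing with the weights assigned to that temporal offset.
        Y += neighbors @ W[k]
    return Y

def joint_attention(Y):
    """Hypothetical attention: softmax over joints of their mean activation."""
    scores = Y.mean(axis=(0, 2))              # one score per joint, shape (V,)
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return Y * weights[None, :, None], weights

# Toy example: a 3-joint chain, 5 frames, 2 input and 4 output channels.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(5, 3, 2))
W = rng.normal(size=(3, 2, 4))
Y = graph_conv_3d(X, A, W)
Y_att, joint_weights = joint_attention(Y)
print(Y.shape, Y_att.shape, joint_weights)
```

With K = 1 the sketch reduces to an ordinary per-frame (2D) graph convolution, which reflects the abstract's description of extending a 2D graph convolution kernel with a temporal sampling dimension.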
