Funding: National Natural Science Foundation of China; Key Research and Development Program of Shandong Province; Natural Science Foundation of Shandong Province; Fundamental Research Funds

Local Feature Fusion Temporal Convolutional Network for Human Action Recognition
Song Zhen, Zhou Yuanfeng, Jia Jingong, Xin Shiqing, Liu Yi. Local Feature Fusion Temporal Convolutional Network for Human Action Recognition[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(3): 418-424
Authors:Song Zhen  Zhou Yuanfeng  Jia Jingong  Xin Shiqing  Liu Yi
Affiliation: (School of Software, Shandong University, Jinan 250101; School of Computer Science and Technology, Shandong University, Qingdao 266237)
Abstract: To address action recognition on 3D human skeleton sequences, a temporal convolutional network (TCN) method combining local feature fusion is proposed. First, global spatial features are extracted by modeling the spatial position changes of all joints across the whole skeleton sequence of an action. Then, following the topological structure of the human body's joints and their connections, the global spatial features are partitioned into local spatial features for the body parts, and each local feature is fed into a corresponding TCN to learn the internal feature relations among the joints of that part. Finally, the output feature vectors of the parts are fused to learn the cooperative relations between parts, completing the recognition of the action. Classification experiments with the proposed method on the challenging NTU RGB+D dataset show that, compared with existing CNN-, LSTM-, and TCN-based methods, the classification accuracy is improved to 79.5% on the cross-subject benchmark and 84.6% on the cross-view benchmark.
Keywords:action recognition  temporal convolutional network  three-dimensional human skeleton
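The pipeline described in the abstract — partition the skeleton's joints into body parts, run a temporal convolution over each part's joint trajectories, then fuse the per-part responses — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the joint-index partition, kernel size, random kernels, and max-pooling fusion are all assumptions chosen for brevity (a trained model would learn the kernels and feed the fused vector to a classifier).

```python
import numpy as np

# Hypothetical partition of a 25-joint skeleton into five body parts.
# These indices are illustrative only, not the official NTU RGB+D joint map.
PARTS = {
    "trunk":     [0, 1, 2, 3, 20],
    "left_arm":  [4, 5, 6, 7, 21, 22],
    "right_arm": [8, 9, 10, 11, 23, 24],
    "left_leg":  [12, 13, 14, 15],
    "right_leg": [16, 17, 18, 19],
}

def temporal_conv(x, kernel):
    """Valid 1D convolution along the time axis.
    x: (T, C) sequence; kernel: (K, C). Returns a (T-K+1,) response."""
    T, C = x.shape
    K = kernel.shape[0]
    out = np.empty(T - K + 1)
    for t in range(T - K + 1):
        out[t] = np.sum(x[t:t + K] * kernel)
    return out

def fused_part_features(skeleton, kernel_size=3, seed=0):
    """skeleton: (T, J, 3) joint positions over T frames.
    For each body part: flatten its joint coordinates per frame,
    apply a temporal convolution (random kernel here, standing in
    for a learned one), max-pool over time, then concatenate the
    per-part responses into one fused feature vector."""
    rng = np.random.default_rng(seed)
    fused = []
    for name, idx in PARTS.items():
        local = skeleton[:, idx, :].reshape(skeleton.shape[0], -1)  # (T, |part|*3)
        kernel = rng.standard_normal((kernel_size, local.shape[1]))
        response = temporal_conv(local, kernel)   # (T-K+1,)
        fused.append(response.max())              # temporal max pooling
    return np.array(fused)                        # (num_parts,)

# Example: a 30-frame, 25-joint skeleton sequence
skeleton = np.random.default_rng(1).standard_normal((30, 25, 3))
feat = fused_part_features(skeleton)
print(feat.shape)  # (5,)
```

The key design point the sketch mirrors is that each part's TCN sees only that part's joints, so intra-part motion patterns are learned independently, while the final concatenation lets a downstream classifier model inter-part cooperation.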
This article has been indexed by CNKI, VIP, Wanfang Data, and other databases.