Funding: National Natural Science Foundation of China; Key Research and Development Program of Shandong Province; Natural Science Foundation of Shandong Province; Fundamental Research Funds

Local Feature Fusion Temporal Convolutional Network for Human Action Recognition
Song Zhen, Zhou Yuanfeng, Jia Jingong, Xin Shiqing, Liu Yi. Local Feature Fusion Temporal Convolutional Network for Human Action Recognition[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(3): 418-424
Authors:Song Zhen  Zhou Yuanfeng  Jia Jingong  Xin Shiqing  Liu Yi
Affiliation: (School of Software, Shandong University, Jinan 250101; School of Computer Science and Technology, Shandong University, Qingdao 266237)
Abstract: To address action recognition on 3D human skeleton sequences, a temporal convolutional network (TCN) method combining local feature fusion is proposed. First, global spatial features are extracted by modeling the spatial position changes of all joints across the whole skeleton sequence of an action. Then, following the topological structure of the human body's joints and their connections, the global spatial features are partitioned into local spatial features for the body parts, and each local feature is fed into a corresponding TCN to learn the internal feature relations among the joints of that part. Finally, the output feature vectors of the parts are fused to learn the cooperative relations between parts, completing the recognition of the action. Classification experiments with the proposed method on the challenging NTU RGB+D dataset show that, compared with existing CNN-, LSTM-, and TCN-based methods, the classification accuracy is improved to 79.5% on the cross-subject benchmark and 84.6% on the cross-view benchmark.
Keywords:action recognition  temporal convolutional network  three-dimensional human skeleton
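The pipeline described in the abstract — partition the skeleton's joints into body parts, run a temporal convolution over each part's joint trajectories, then fuse the per-part responses — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the joint-index partition, kernel size, random kernels, and max-pooling fusion are all assumptions chosen for brevity (a trained model would learn the kernels and feed the fused vector to a classifier).

```python
import numpy as np

# Hypothetical partition of a 25-joint skeleton into five body parts.
# These indices are illustrative only, not the official NTU RGB+D joint map.
PARTS = {
    "trunk":     [0, 1, 2, 3, 20],
    "left_arm":  [4, 5, 6, 7, 21, 22],
    "right_arm": [8, 9, 10, 11, 23, 24],
    "left_leg":  [12, 13, 14, 15],
    "right_leg": [16, 17, 18, 19],
}

def temporal_conv(x, kernel):
    """Valid 1D convolution along the time axis.
    x: (T, C) sequence; kernel: (K, C). Returns a (T-K+1,) response."""
    T, C = x.shape
    K = kernel.shape[0]
    out = np.empty(T - K + 1)
    for t in range(T - K + 1):
        out[t] = np.sum(x[t:t + K] * kernel)
    return out

def fused_part_features(skeleton, kernel_size=3, seed=0):
    """skeleton: (T, J, 3) joint positions over T frames.
    For each body part: flatten its joint coordinates per frame,
    apply a temporal convolution (random kernel here, standing in
    for a learned one), max-pool over time, then concatenate the
    per-part responses into one fused feature vector."""
    rng = np.random.default_rng(seed)
    fused = []
    for name, idx in PARTS.items():
        local = skeleton[:, idx, :].reshape(skeleton.shape[0], -1)  # (T, |part|*3)
        kernel = rng.standard_normal((kernel_size, local.shape[1]))
        response = temporal_conv(local, kernel)   # (T-K+1,)
        fused.append(response.max())              # temporal max pooling
    return np.array(fused)                        # (num_parts,)

# Example: a 30-frame, 25-joint skeleton sequence
skeleton = np.random.default_rng(1).standard_normal((30, 25, 3))
feat = fused_part_features(skeleton)
print(feat.shape)  # (5,)
```

The key design point the sketch mirrors is that each part's TCN sees only that part's joints, so intra-part motion patterns are learned independently, while the final concatenation lets a downstream classifier model inter-part cooperation.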
This article has been indexed by CNKI, VIP, Wanfang Data, and other databases.