Multi-modal human behavior recognition algorithm based on attention mechanism
Citation: SONG Zhendong, YANG Guochao, MA Yupeng, FENG Xiaoyi. Multi-modal human behavior recognition algorithm based on attention mechanism[J]. Computer Measurement & Control, 2022, 30(2): 276-283.
Authors: SONG Zhendong, YANG Guochao, MA Yupeng, FENG Xiaoyi
Affiliation: School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, China; Shaanxi Huaming Putai Medical Equipment Co., Ltd., Xi'an 710119, China; College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China
Foundation: Key Research and Development Program of Shaanxi Province (2021ZDLGY15-01, 2021ZDLGY09-04, 2021GY-004, 2020GY-050); Shenzhen International Cooperation Research Project (GJHZ20200731095204013); National Natural Science Foundation of China (61772419)
Abstract: A multi-modal human behavior recognition algorithm based on an attention mechanism is proposed. To address the effective fusion of multi-modal features, a two-stream feature-fusion convolutional network, TAM3DNet (two-stream attention mechanism 3D network), is designed. The backbone, AM3DNet (attention mechanism 3D network), combines a 3D network with an attention mechanism: the feature map is weighted by the attention map to obtain weighted behavior features, so that the network focuses on limb-motion regions and the influence of the background and static body regions is weakened. The color and depth modalities of the RGB-D data are fed into the two streams of the network, color and depth behavior features are obtained from the two branch networks, and the fused features are then classified to produce the human behavior recognition result.
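To make the attention-weighting step concrete, the sketch below shows one way a 3D feature map can be multiplied by a learned attention map so that limb-motion regions are emphasized. It is a minimal PyTorch illustration only: the module name AttentionWeight3D, the 1x1x1 convolution, and the sigmoid gating are assumptions, not the published AM3DNet design.

```python
# Minimal sketch (assumed design, not the paper's AM3DNet): a single-channel
# attention map is produced by a 1x1x1 convolution + sigmoid, then multiplied
# element-wise with the 3D feature map to obtain weighted behavior features.
import torch
import torch.nn as nn

class AttentionWeight3D(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.att_conv = nn.Conv3d(channels, 1, kernel_size=1)  # hypothetical attention head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, height, width) feature map from a 3D CNN
        att = torch.sigmoid(self.att_conv(x))   # attention map with values in [0, 1]
        return x * att                          # weighted behavior features

if __name__ == "__main__":
    feat = torch.randn(2, 64, 16, 28, 28)       # dummy feature map
    print(AttentionWeight3D(64)(feat).shape)    # torch.Size([2, 64, 16, 28, 28])
```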

Keywords: RGB-D image; multi-modal features; human behavior; two-stream network; attention mechanism; feature fusion
Received: 2022-01-07
Revised: 2022-01-11

Multi-modal human behavior recognition algorithm based on attention mechanism
SONG Zhendong, YANG Guochao, MA Yupeng, FENG Xiaoyi. Multi-modal human behavior recognition algorithm based on attention mechanism[J]. Computer Measurement & Control, 2022, 30(2): 276-283.
Authors: SONG Zhendong, YANG Guochao, MA Yupeng, FENG Xiaoyi
Affiliation: School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710129, China; Shaanxi Huaming Putai Medical Equipment Co., Ltd., Xi'an 710119, China; College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China
Abstract: A multi-modal human behavior recognition algorithm based on an attention mechanism is proposed. To address the problem of effectively fusing multi-modal features, a two-stream feature-fusion convolutional network, TAM3DNet (two-stream attention mechanism 3D network), is designed. Its backbone, AM3DNet (attention mechanism 3D network), incorporates an attention mechanism that weights the feature map by the attention map to produce weighted behavior features, so the network focuses on limb-movement regions and the influence of the background and static body regions is reduced. The color and depth modalities of the RGB-D data serve as the inputs of the two streams; color and depth behavior features are extracted by the two branch networks, and the fused features are then classified to obtain the human behavior recognition result.
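The two-stream arrangement described above (separate color and depth branches whose features are fused and then classified) can likewise be sketched in PyTorch. The stand-in branch layers, the class name TwoStreamFusion, and feature concatenation as the fusion operator are illustrative assumptions; the paper's TAM3DNet is not reproduced here.

```python
# Illustrative two-stream sketch (assumed layers, not the paper's TAM3DNet):
# each branch extracts a feature vector from its modality, the vectors are
# concatenated, and a linear layer outputs class scores.
import torch
import torch.nn as nn

def branch(in_channels: int, feat_dim: int = 128) -> nn.Sequential:
    # Tiny stand-in 3D backbone used only to keep the example self-contained.
    return nn.Sequential(
        nn.Conv3d(in_channels, 32, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool3d(1),
        nn.Flatten(),
        nn.Linear(32, feat_dim),
    )

class TwoStreamFusion(nn.Module):
    def __init__(self, num_classes: int, feat_dim: int = 128):
        super().__init__()
        self.rgb_branch = branch(3, feat_dim)    # color stream
        self.depth_branch = branch(1, feat_dim)  # depth stream
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.rgb_branch(rgb), self.depth_branch(depth)], dim=1)
        return self.classifier(fused)            # recognition scores per action class

if __name__ == "__main__":
    rgb = torch.randn(2, 3, 16, 112, 112)        # (batch, C, T, H, W) color clips
    depth = torch.randn(2, 1, 16, 112, 112)      # depth clips
    print(TwoStreamFusion(num_classes=10)(rgb, depth).shape)  # torch.Size([2, 10])
```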
Keywords: RGB-D image; multi-modal features; human behavior; two-stream network; attention mechanism; feature fusion