High resolution network human pose estimation based on multi-scale attention mechanism

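Cite this article:	Li Li, Zhang Rong Fen, Liu Yu Hong, Chen Na, Zhang Wen Wen. High resolution network human pose estimation based on multi-scale attention mechanism[J]. Application Research of Computers, 2022, 39(11).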

Authors:	Li Li, Zhang Rong Fen, Liu Yu Hong, Chen Na, Zhang Wen Wen

Affiliation:	College of Big Data and Information Engineering, Guizhou University (all authors)

Funding:	Science and Technology Foundation of Guizhou Province (黔科合基础-ZK[2021]重点001)

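Abstract:	Human pose estimation struggles to predict the correct pose when the scale of the feature maps varies. To address this, the paper proposes MSANet (multiscale-attention net), a high-resolution network built on a multi-scale attention mechanism, to raise the detection accuracy of human pose estimation. Lightweight pyramid convolution and attention feature fusion are introduced to extract multi-scale information more efficiently. A self-transformer module is used in the fusion of the parallel subnetworks to enhance features and capture global context. In the output stage, the features of all layers are fused with an adaptive spatial feature fusion strategy to form the final output, which draws more fully on the semantic information of high-level features and the fine-grained detail of low-level features in order to infer invisible and occluded keypoints. In tests on the public COCO2017 dataset, the method improves estimation accuracy by 4.2% over the baseline network HRNet.
Keywords:	human pose estimation high-resolution network multi-scale attention feature fusion adaptive spatial feature fusion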
Received:	2022/3/4
Revised:	2022/10/22
High resolution network human pose estimation based on multi-scale attention mechanism

Li Li, Zhang Rong Fen, Liu Yu Hong, Chen Na and Zhang Wen Wen. High resolution network human pose estimation based on multi-scale attention mechanism[J]. Application Research of Computers, 2022, 39(11).

Authors:	Li Li, Zhang Rong Fen, Liu Yu Hong, Chen Na and Zhang Wen Wen

Affiliation:	College of Big Data and Information Engineering, Guizhou University

Abstract:	In human pose estimation, changes in the scale of the feature map make it difficult to predict correct poses. To solve this problem, this paper proposes MSANet (multiscale-attention net), a high-resolution network based on a multi-scale attention mechanism, to improve the detection accuracy of human pose estimation. It introduces lightweight pyramid convolution and attention feature fusion to extract multi-scale information more efficiently, and applies a self-transformer module in the fusion of parallel subnets for feature enhancement to obtain global features. In the output stage, the features of each layer are fused using an adaptive spatial feature fusion strategy to form the final output, which more fully captures the semantic information of high-level features and the fine-grained detail of low-level features in order to infer invisible and occluded keypoints. On the public COCO2017 dataset, the experimental results show that this method improves estimation accuracy by 4.2% compared with the baseline network HRNet.

Keywords:	human pose estimation high-resolution network multi-scale attention feature fusion adaptive spatial feature fusion
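
The adaptive spatial feature fusion step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-channel feature maps stored as nested lists, nearest-neighbour upsampling to a common resolution, and per-pixel weight logits passed in directly (in a trained network these would typically be produced by learned layers). The function names `upsample_nearest`, `softmax` and `adaptive_fusion` are hypothetical.

```python
import math


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D map (list of rows) by an integer factor."""
    return [
        [row[x // factor] for x in range(len(row) * factor)]
        for row in fmap
        for _ in range(factor)
    ]


def softmax(logits):
    """Numerically stable softmax over a short list of numbers."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(levels, weight_logits):
    """Fuse same-sized 2D maps with per-pixel softmax weights.

    levels: list of H x W maps already resized to a common resolution.
    weight_logits: list of H x W maps, one per level (hypothetical stand-ins
    for learned weight predictions).
    """
    height, width = len(levels[0]), len(levels[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # The weights at each pixel are positive and sum to 1 across levels.
            weights = softmax([logit[y][x] for logit in weight_logits])
            fused[y][x] = sum(w * lvl[y][x] for w, lvl in zip(weights, levels))
    return fused


# Toy example: a 4x4 high-resolution map and a 2x2 low-resolution map.
high = [[1.0] * 4 for _ in range(4)]
low = upsample_nearest([[3.0, 3.0], [3.0, 3.0]], 2)
equal_logits = [[0.0] * 4 for _ in range(4)]
fused = adaptive_fusion([high, low], [equal_logits, equal_logits])
print(fused[0][0])  # equal logits give the plain average of the two levels: 2.0
```

Because the weights are computed per pixel, the fusion can favour the high-resolution level in some regions and the low-resolution level in others. With equal logits it reduces to a plain average.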
