Indoor crowd detection network based on multi-level features and hybrid attention mechanism

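Cite this article:	SHEN Wenxiang, QIN Pinle, ZENG Jianchao. Indoor crowd detection network based on multi-level features and hybrid attention mechanism[J]. Journal of Computer Applications, 2019, 39(12): 3496-3502.

Authors:	SHEN Wenxiang, QIN Pinle, ZENG Jianchao

Affiliation:	College of Big Data, North University of China, Taiyuan 030051 (all three authors)

Funding:	Key Research and Development Program of Shanxi Province (201803D31212-1).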

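Abstract:	To address the diversity of target scales and poses in indoor crowds and the tendency of head targets to be confused with the features of surrounding objects, an indoor crowd detection network based on multi-level features and a hybrid attention mechanism (MFANet) is proposed. The network consists of three parts: a feature fusion module, a multi-scale dilated convolution pyramid feature decomposition module, and a hybrid attention module. First, shallow features are fused with intermediate-layer features to form fused features that carry context information, which addresses the weak semantic information and weak classification ability of small targets in shallow feature maps. Then, exploiting the property that dilated convolution enlarges the receptive field without adding parameters, the fused features are decomposed at multiple scales to form a new small-target detection branch, so that the network can localize and detect targets at multiple scales. Finally, a local hybrid attention module fuses global pixel-correlation spatial attention with channel attention and strengthens the features that contribute most to key information, improving the network's ability to distinguish targets from background. Experimental results show that the proposed method achieves a precision of 0.94, a recall of 0.91, and an F1 score of 0.92 on the indoor surveillance dataset SCUT-HEAD, clearly outperforming other algorithms currently used for indoor crowd detection on all three metrics.
Keywords:	indoor crowd detection; feature fusion; attention mechanism; dilated convolution; feature pyramid
Received:	2019-06-24
Revised:	2019-09-19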
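
The abstract's point that dilated convolution enlarges the receptive field without adding parameters can be checked with a little arithmetic. The sketch below is not the authors' implementation: the function names, the 3x3 kernel, the 64 channels, and the dilation rates 1, 2, 4 are all illustrative assumptions, since the abstract does not state the rates or layer sizes used in MFANet. It uses the standard relation k_eff = k + (k - 1)(d - 1) for the span of a kernel of size k at dilation d, and counts weights only (bias terms are ignored).

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels along one axis, covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a 2-D convolution layer; note that dilation is not an input."""
    return kernel_size * kernel_size * in_channels * out_channels


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated correlation, written out explicitly for clarity."""
    span = effective_kernel_size(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


if __name__ == "__main__":
    # Hypothetical dilation rates; the abstract does not specify them.
    for d in (1, 2, 4):
        print(d, effective_kernel_size(3, d), conv_weight_count(3, 64, 64))
    # Each output sums every second sample, so three weights span five inputs.
    print(dilated_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 1, 1], dilation=2))
```

Under these assumptions the per-axis span grows from 3 to 5 to 9 pixels while the weight count stays fixed, which is why stacking branches with different dilation rates is a common way to cover several target scales from one fused feature map.
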
Indoor crowd detection network based on multi-level features and hybrid attention mechanism

SHEN Wenxiang, QIN Pinle, ZENG Jianchao. Indoor crowd detection network based on multi-level features and hybrid attention mechanism[J]. Journal of Computer Applications, 2019, 39(12): 3496-3502.

Authors:	SHEN Wenxiang, QIN Pinle, ZENG Jianchao

Affiliation:	College of Big Data, North University of China, Taiyuan, Shanxi 030051, China

Abstract:	To handle the diversity of target scales and poses in indoor crowds, and the confusion of head targets with surrounding objects, a new Network based on Multi-level Features and hybrid Attention mechanism for indoor crowd detection (MFANet) was proposed. It is composed of three parts: a feature fusion module, a multi-scale dilated convolution pyramid feature decomposition module, and a hybrid attention module. Firstly, shallow features and intermediate-layer features were combined into a fused feature containing context information, to compensate for the limited semantic information and weak classification ability of small targets in the shallow feature map. Then, because dilated convolution increases the receptive field without increasing the number of parameters, it was used to decompose the fused features at multiple scales and form a new small-target detection branch, allowing the network to localize and detect multi-scale targets. Finally, the local hybrid attention module was used to integrate global pixel-correlation spatial attention and channel attention, enhancing the features that contribute most to the key information and improving the ability to distinguish targets from background. The experimental results show that the proposed method achieves a precision of 0.94, a recall of 0.91 and an F1 score of 0.92 on the indoor surveillance scene dataset SCUT-HEAD. All three values are significantly better than those of other algorithms currently used for indoor crowd detection.
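
The three reported figures can be cross-checked against each other. The sketch below assumes that the first figure (0.94) is detection precision, so that the F1 score is the harmonic mean of it and the recall (0.91); the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Values taken from the abstract: 0.94 (assumed to be precision) and recall 0.91.
    print(round(f1_score(0.94, 0.91), 2))
```

With these inputs the harmonic mean is about 0.925, which matches the reported F1 score of 0.92 at two decimal places and supports reading the first figure as precision.
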

Keywords:	indoor crowd detection; feature fusion; attention mechanism; dilated convolution; feature pyramid
This article is indexed in Wanfang Data and other databases.