
YOLO-Based Multi-Modal Weighted Fusion Pedestrian Detection Algorithm
Cite this article: SHI Zheng, MAO Li, SUN Jun. YOLO-Based Multi-Modal Weighted Fusion Pedestrian Detection Algorithm[J]. Computer Engineering, 2021, 47(8): 234-242.
Authors: SHI Zheng  MAO Li  SUN Jun
Affiliation: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
Funding: National Natural Science Foundation of China (61672263).
Received: 2020-06-24
Revised: 2020-08-11
Abstract: Single-modal pedestrian detection algorithms that rely on visible-light images perform poorly when lighting is insufficient at night, when occlusion hides part of the target, and when pedestrians appear at multiple scales. To improve the robustness of pedestrian detectors, a YOLO-based pedestrian detection algorithm that fuses visible and infrared light is proposed. Darknet53 serves as the feature extraction network and extracts multi-scale features from each of the two modalities separately. The concat fusion used by conventional multi-modal pedestrian detection algorithms is replaced with a modal weighted fusion layer that incorporates an attention mechanism, which strengthens modality selection in the fused feature map. The fused multi-scale features are then used for pedestrian detection. Experimental results show that modal weighted fusion is considerably more accurate than concat fusion. The proposed algorithm performs well under insufficient night-time lighting, target occlusion, and multi-scale targets, and on the KAIST dataset it achieves higher detection accuracy and speed than HalFusion, Fusion RPN+BDT and other algorithms.
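The difference between concat fusion and modal weighted fusion can be illustrated with a minimal sketch. The abstract does not specify how the paper's fusion layer is built, so everything below is an assumption made for illustration: feature maps are plain nested lists of shape [C][H][W], the attention score of a modality is simply its mean activation, and the function names `concat_fusion`, `modality_weights` and `weighted_fusion` are hypothetical. In a trained network the weights would come from learned layers rather than a fixed rule.

```python
import math


def global_mean(fmap):
    """Global average pooling over a [C][H][W] nested-list feature map."""
    values = [v for channel in fmap for row in channel for v in row]
    return sum(values) / len(values)


def modality_weights(visible, infrared):
    """Softmax over one score per modality.

    Assumption: the score is the mean activation of the feature map.
    A real attention layer would learn this scoring function.
    """
    scores = [global_mean(visible), global_mean(infrared)]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concat_fusion(visible, infrared):
    """Baseline: stack the channels of both modalities (C + C channels)."""
    return visible + infrared


def weighted_fusion(visible, infrared):
    """Element-wise sum of both modalities, scaled by their softmax weights."""
    w_vis, w_ir = modality_weights(visible, infrared)
    return [
        [
            [w_vis * v + w_ir * r for v, r in zip(v_row, r_row)]
            for v_row, r_row in zip(v_ch, r_ch)
        ]
        for v_ch, r_ch in zip(visible, infrared)
    ]


# Toy night-time scene: the visible map is almost empty, the infrared map is not.
visible = [[[0.0, 0.0], [0.0, 0.0]]]
infrared = [[[1.0, 1.0], [1.0, 1.0]]]

stacked = concat_fusion(visible, infrared)   # 2 channels, modalities treated equally
weights = modality_weights(visible, infrared)
fused = weighted_fusion(visible, infrared)   # 1 channel, infrared weighted more heavily
```

In this toy case concat fusion passes both maps on unchanged and leaves modality selection to later layers, while the weighted variant shifts the fused map toward the more informative infrared input.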
Keywords: pedestrian detection  target detection  multi-modal algorithm  YOLO network  attention mechanism