Fund project: Science and Technology Innovation Special Zone Program (20-163-14-LZ-001-004-01)
Received: 2023-12-14
Revised: 2024-01-17

Small Target Detection in Video Surveillance Based on Improved YOLOv7
XIA Xiang, ZHU Ming. Small target detection in video surveillance based on improved YOLOv7[J]. Computer Systems & Applications, 2024, 33(7): 52-62
Authors: XIA Xiang, ZHU Ming
Affiliation: School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China
Abstract: Small target detection is a highly challenging task that arises widely in everyday settings; in video surveillance scenarios, a pedestrian's face about 20 m from the camera can already be considered a small target. Because faces may occlude one another and are easily affected by noise, weather, and lighting conditions, existing detection models perform worse on such small targets than on medium and large ones. To address this, this study proposes an improved YOLOv7 model: a high-resolution detection head is added, and the backbone network is redesigned based on GhostNetV2. The PANet structure is replaced with a combination of BiFPN and SA attention modules to strengthen multi-scale feature fusion, and the original CIoU loss function is improved by incorporating the Wasserstein distance, reducing the sensitivity of small targets to anchor box position offsets. Comparative experiments are conducted on the public VisDrone2019 dataset and a self-built video surveillance dataset. The results show that the proposed method raises mAP to 50.1% on VisDrone2019 and exceeds existing methods by 1.6 percentage points on the self-built dataset, effectively improving small target detection while achieving good real-time performance on a GTX 1080 Ti.
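As a rough sketch of the multi-scale fusion idea mentioned above, the fast normalized fusion used in BiFPN assigns each input feature map a learnable non-negative scalar weight and normalizes by their sum instead of a softmax. The shapes, weight values, and epsilon below are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    # BiFPN-style fusion: each same-shaped input feature map gets a
    # learnable scalar weight; ReLU keeps the weights non-negative and
    # dividing by their sum normalizes them without a softmax.
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))

# Two same-shaped feature maps (e.g. a top-down path and a skip input)
f1 = np.ones((1, 8, 8))
f2 = 3.0 * np.ones((1, 8, 8))
fused = fast_normalized_fusion([f1, f2], [1.0, 1.0])
```

With equal weights, the result is approximately the mean of the two inputs; during training the weights shift to favor the more informative scale.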
Keywords: small target detection; attention mechanism; feature fusion; loss function
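The Wasserstein-distance improvement to the loss function follows the normalized Gaussian Wasserstein distance (NWD) idea for tiny objects: each box is modeled as a 2D Gaussian, and the closed-form 2-Wasserstein distance between the two Gaussians replaces IoU-style overlap, so similarity degrades smoothly with offset even when boxes barely overlap. A minimal sketch, assuming boxes in (cx, cy, w, h) format; the normalizing constant `c` is dataset-dependent and the value here is only a placeholder:

```python
import math

def nwd(box_a, box_b, c=12.8):
    # Each box (cx, cy, w, h) is modeled as a 2D Gaussian
    # N((cx, cy), diag((w/2)^2, (h/2)^2)); the squared 2-Wasserstein
    # distance between two such Gaussians has this closed form.
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + (wa / 2 - wb / 2) ** 2 + (ha / 2 - hb / 2) ** 2)
    # Map the distance into (0, 1] so it behaves like a similarity score
    return math.exp(-math.sqrt(w2_sq) / c)

def nwd_loss(box_a, box_b, c=12.8):
    # Loss decreases as the predicted box approaches the ground truth
    return 1.0 - nwd(box_a, box_b, c)
```

Unlike IoU, a small positional offset of a tiny box changes this score only gradually, which is what reduces the sensitivity to anchor box position offsets described in the abstract.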