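Survey of Target Recognition Algorithms for Video Data Using Deep Learning
Cite this article:WANG Zhenhua, LI Jing, ZHANG Xinyue, ZHENG Zongsheng, LU Peng, LUAN Kuifeng. Survey of Target Recognition Algorithms for Video Data Using Deep Learning[J]. Computer Engineering, 2022, 48(4): 1-15.
Authors:WANG Zhenhua  LI Jing  ZHANG Xinyue  ZHENG Zongsheng  LU Peng  LUAN Kuifeng
Affiliation:1. College of Information, Shanghai Ocean University, Shanghai 201306, China; 2. College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, China
Funding:National Natural Science Foundation of China (61972240)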
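Abstract:Target recognition is a major challenge in computer vision. With the development of deep learning, target recognition algorithms have been widely applied to recognizing and monitoring targets in video data. This paper summarizes existing target recognition algorithms and divides the mainstream algorithms into two classes, Anchor-Based and Anchor-Free, according to whether they use an anchor mechanism. For Anchor-Based algorithms such as R-CNN, SPP-Net, SSD, and YOLOv2, the differences and respective advantages of region-based and regression-based algorithms are analyzed in terms of candidate box creation, feature extraction, and result generation. For Anchor-Free algorithms such as CornerNet, ExtremeNet, CenterNet, and FCOS, the differences and respective advantages of key point-based and feature pyramid-based algorithms are analyzed in terms of feature extraction, key point selection/hierarchy, and result generation. On this basis, eight representative target recognition algorithms, including Faster R-CNN, Mask R-CNN, and SSD, are compared and summarized using recognition efficiency and recognition accuracy as evaluation metrics. Finally, in view of problems such as time-consuming data preprocessing, low accuracy in simultaneous multi-scale feature recognition, and complicated structure, the shortcomings of current research and future research directions are analyzed and discussed.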

Keywords:deep learning  object recognition  anchor box  region proposal  key point  video data
Received:2021-07-30
Revised:2021-10-23

Survey of Target Recognition Algorithms for Video Data Using Deep Learning
WANG Zhenhua, LI Jing, ZHANG Xinyue, ZHENG Zongsheng, LU Peng, LUAN Kuifeng. Survey of Target Recognition Algorithms for Video Data Using Deep Learning[J]. Computer Engineering, 2022, 48(4): 1-15.
Authors:WANG Zhenhua  LI Jing  ZHANG Xinyue  ZHENG Zongsheng  LU Peng  LUAN Kuifeng
Affiliation:1. College of Information, Shanghai Ocean University, Shanghai 201306, China;2. College of Marine Sciences, Shanghai Ocean University, Shanghai 201306, China
Abstract:Target recognition is a major challenge in computer vision. With the development of deep learning, target recognition algorithms have been widely applied to recognizing and monitoring targets in video data. This paper summarizes existing target recognition algorithms and divides the mainstream algorithms into two classes, Anchor-Based and Anchor-Free, according to whether they use an anchor mechanism. For Anchor-Based target recognition algorithms, such as R-CNN, SPP-Net, SSD, and YOLOv2, the differences and respective advantages of region-based and regression-based algorithms are analyzed from the perspectives of candidate box creation, feature extraction, and result generation. For Anchor-Free target recognition algorithms, such as CornerNet, ExtremeNet, CenterNet, and FCOS, the differences and respective advantages of key point-based and feature pyramid-based algorithms are analyzed from the perspectives of feature extraction, key point selection/hierarchy, and result generation. On this basis, eight representative target recognition algorithms, including Faster R-CNN, Mask R-CNN, and SSD, are compared and summarized with recognition efficiency and recognition accuracy as evaluation indices. Finally, in view of problems such as time-consuming data preprocessing, low accuracy in simultaneous multi-scale feature recognition, and complicated algorithm structure, the shortcomings of current research are analyzed and future research directions are suggested.
Keywords:deep learning  object recognition  anchor box  region proposal  key point  video data  
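
The Anchor-Based versus Anchor-Free distinction drawn in the abstract can be illustrated with a minimal Python sketch. It is not taken from the surveyed paper: the function names, the scale and ratio values, and the convention that a ratio means width divided by height are all assumptions chosen for illustration. The first function enumerates predefined anchor boxes around one location, the second computes an FCOS-style target (distances from a point to the four sides of a ground-truth box), and the third is the intersection-over-union measure commonly used to match anchors to ground truth.

```python
import math


def make_anchors(cx, cy, scales=(32, 64), ratios=(0.5, 1.0, 2.0)):
    """Anchor-based: enumerate boxes (x1, y1, x2, y2) centered at (cx, cy).

    Assumption: ratio = width / height, and each anchor has area scale**2.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * math.sqrt(r)
            h = s / math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def ltrb_target(px, py, box):
    """Anchor-free (FCOS-style): distances from point (px, py) to the
    left, top, right, and bottom sides of a ground-truth box."""
    x1, y1, x2, y2 = box
    return (px - x1, py - y1, x2 - px, y2 - py)


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


if __name__ == "__main__":
    gt = (10, 20, 110, 100)  # hypothetical ground-truth box
    anchors = make_anchors(60, 60)
    best = max(anchors, key=lambda a: iou(a, gt))
    print("anchors per location:", len(anchors))
    print("best-matching anchor:", best)
    print("anchor-free target at (50, 40):", ltrb_target(50, 40, gt))
```

In this sketch the anchor-based path needs a set of hand-chosen scales and ratios per location, whereas the anchor-free path regresses four distances directly from a point inside the box.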