
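深浅层表示融合的半监督视频目标分割 (Semi-supervised video object segmentation via deep and shallow representations fusion)
Cite this article: LYU Xiao, SONG Huihui, FAN Jiaqing. Semi-supervised video object segmentation via deep and shallow representations fusion[J]. Journal of Computer Applications, 2022, 42(12): 3884-3890.
Authors: LYU Xiao, SONG Huihui, FAN Jiaqing
Affiliations: Jiangsu Key Laboratory of Big Data Analysis Technology (Nanjing University of Information Science and Technology), Nanjing 210044
Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (Nanjing University of Information Science and Technology), Nanjing 210044
Funding: National Natural Science Foundation of China (61872189); Natural Science Foundation of Jiangsu Province (BK20191397)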
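Abstract: Semi-supervised video object segmentation struggles to balance segmentation accuracy against segmentation speed, and to separate the target from background objects that resemble the foreground. To address both problems, a semi-supervised video object segmentation algorithm based on deep and shallow feature fusion is proposed. First, a pre-generated coarse mask is used to process the image features and obtain more robust features. Then, an attention model extracts deep semantic information. Finally, the deep semantic information is fused with shallow positional information to yield more accurate segmentation results. Experiments on several popular datasets show that, with segmentation speed essentially unchanged, the proposed algorithm improves the Jaccard (J) index on the DAVIS 2016 dataset by 1.8 percentage points over the Learning Fast and Robust Target Models for Video Object Segmentation (FRTM) algorithm, and improves the overall metric J&F (the mean of J and the F-score) by 2.3 percentage points. On the DAVIS 2017 dataset, it improves J by 1.2 percentage points and J&F by 1.1 percentage points over FRTM. These results indicate that the proposed algorithm achieves higher segmentation accuracy while maintaining a fast segmentation speed, distinguishes similar foreground and background objects effectively, and is robust.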

Keywords: video object segmentation; attention; fusion; deep semantic information; shallow positional information
Received: 2021-09-17
Revised: 2022-01-11
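
The three steps named in the abstract (coarse-mask feature processing, attention, deep/shallow fusion) can be pictured with a small NumPy sketch. The abstract does not specify the actual modules, so everything below is an assumption chosen for illustration: the mask weighting rule, the parameter-free channel gate, nearest-neighbour upsampling, channel concatenation as the fusion step, and all function names and tensor shapes. It is not the authors' implementation.

```python
import numpy as np

def mask_guided_features(feat, coarse_mask):
    """Emphasise locations the coarse mask marks as foreground (assumed rule).

    feat: (C, H, W) feature map; coarse_mask: (H, W) with values in [0, 1].
    """
    return feat * (1.0 + coarse_mask[None, :, :])

def channel_attention(feat):
    """Parameter-free channel gate: global average pool, sigmoid, rescale."""
    pooled = feat.mean(axis=(1, 2))           # (C,)
    gate = 1.0 / (1.0 + np.exp(-pooled))      # (C,), each value in (0, 1)
    return feat * gate[:, None, None]

def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse(deep, shallow):
    """Bring deep features to the shallow resolution and concatenate channels."""
    factor = shallow.shape[1] // deep.shape[1]
    return np.concatenate([upsample_nearest(deep, factor), shallow], axis=0)

# Hypothetical shapes: 8 deep channels at 4x4, 4 shallow channels at 16x16.
rng = np.random.default_rng(0)
deep = rng.standard_normal((8, 4, 4))
shallow = rng.standard_normal((4, 16, 16))
coarse_mask = np.zeros((4, 4))
coarse_mask[1:3, 1:3] = 1.0                   # coarse foreground region

fused = fuse(channel_attention(mask_guided_features(deep, coarse_mask)), shallow)
print(fused.shape)  # (12, 16, 16)
```

Concatenation keeps the shallow map at full resolution, which is one simple way to preserve positional detail next to the coarser semantic channels; a real network would follow this with learned layers.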

Semi-supervised video object segmentation via deep and shallow representations fusion
Xiao LYU, Huihui SONG, Jiaqing FAN. Semi-supervised video object segmentation via deep and shallow representations fusion[J]. Journal of Computer Applications, 2022, 42(12): 3884-3890.
Authors: Xiao LYU, Huihui SONG, Jiaqing FAN
Affiliation: Jiangsu Key Laboratory of Big Data Analysis Technology (Nanjing University of Information Science and Technology), Nanjing Jiangsu 210044, China
Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (Nanjing University of Information Science and Technology), Nanjing Jiangsu 210044, China
Abstract: In semi-supervised video object segmentation, segmentation accuracy and speed are difficult to balance, and background objects that resemble the foreground are hard to distinguish from it. To solve these problems, a semi-supervised video object segmentation algorithm based on deep and shallow feature fusion was proposed. Firstly, a pre-generated rough mask was used to process the image features, thereby obtaining more robust features. Secondly, deep semantic information was extracted by an attention model. Finally, the deep semantic information and shallow position information were fused to obtain more accurate segmentation results. Experiments were conducted on multiple popular datasets. The experimental results show that, with segmentation speed essentially unchanged, the proposed algorithm improves the Jaccard (J) index by 1.8 percentage points and the comprehensive evaluation index J&F (the mean of J and the F-score) by 2.3 percentage points over the Learning Fast and Robust Target Models for Video Object Segmentation (FRTM) algorithm on the DAVIS 2016 dataset. On the DAVIS 2017 dataset, the proposed algorithm improves the J index by 1.2 percentage points and J&F by 1.1 percentage points over FRTM. These results indicate that the proposed algorithm achieves higher segmentation accuracy while keeping a fast segmentation speed, and that it distinguishes similar foreground and background objects effectively and robustly.
Keywords:video object segmentation  attention  fusion  deep semantic information  shallow position information  
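
For readers unfamiliar with the reported metrics, the following minimal sketch shows how J and J&F are formed. J is the intersection over union of the predicted and ground-truth masks, and J&F is the plain average of J and F. In the DAVIS benchmark F is a boundary measure built from contour precision and recall; the contour-matching step is omitted here, so `f_score` simply takes precision and recall as inputs. The function names and the empty-mask convention are assumptions made for this example.

```python
import numpy as np

def jaccard(pred, gt):
    """Region similarity J: intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: counted as a perfect match (assumed convention)
    return float(np.logical_and(pred, gt).sum() / union)

def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

def j_and_f(j, f):
    """Overall score J&F: the arithmetic mean of J and F."""
    return (j + f) / 2.0

# Toy 2x2 masks: one shared pixel, three pixels in the union.
pred = [[1, 1], [0, 0]]
gt = [[1, 0], [1, 0]]
j = jaccard(pred, gt)                  # 1/3
f = f_score(0.5, 1.0)                  # 2/3
print(round(j, 3), round(f, 3), round(j_and_f(j, f), 3))  # 0.333 0.667 0.5
```

Because J&F averages the two scores, a gain of 2.3 percentage points in J&F alongside a gain of 1.8 in J implies that F rose by more than J on DAVIS 2016.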