Video translation model from virtual to real driving scenes based on generative adversarial dual networks
LIU Shihao, HU Xuemin, JIANG Bohou, ZHANG Ruohan, KONG Li. Video translation model from virtual to real driving scenes based on generative adversarial dual networks[J]. Journal of Computer Applications, 2020, 40(6): 1621-1626. DOI: 10.11772/j.issn.1001-9081.2019101802
Authors: LIU Shihao  HU Xuemin  JIANG Bohou  ZHANG Ruohan  KONG Li
Affiliation: School of Computer Science and Information Engineering, Hubei University, Wuhan, Hubei 430062, China
Foundation items: National Natural Science Foundation of China (61806076); Natural Science Foundation of Hubei Province (2018CFB158); Hubei Provincial College Students' Innovation and Entrepreneurship Training Program (S201910512026); Undergraduate Scientific Research Project of Chucai Honors College, Hubei University (20182211006)
Abstract: To address the lack of paired training samples and the inconsistency between consecutive frames in translating virtual driving scenes to real ones, a video translation model based on Generative Adversarial Networks (GAN) was proposed. To cope with the shortage of paired data, the model adopted a dual-network architecture in which the semantic segmentation scene served as an intermediate transition between a front network and a back network. The front network used a convolution-deconvolution framework, together with an optical flow network that extracts dynamic information between consecutive frames, to achieve continuous video translation from virtual scenes to semantic segmentation scenes. The back network used a conditional GAN framework in which a generator, an image discriminator and a video discriminator were designed and combined with the optical flow network to achieve continuous video translation from semantic segmentation scenes to real scenes. Data collected from an autonomous driving simulator and a public dataset were used for training and testing. The model achieves virtual-to-real translation in a variety of driving scenes, and its translation quality is clearly better than that of the comparison algorithms. Experimental results show that the proposed model effectively alleviates the discontinuity between frames and the blurring of moving objects, yields smoother translated videos, and adapts to a variety of complex driving scenes.
Keywords: virtual to real; video translation; Generative Adversarial Network (GAN); optical flow network; driving scene
Received: 2019-10-24
Revised: 2019-12-11
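The dual-network pipeline described in the abstract can be illustrated with a minimal sketch. The PyTorch code below is not the authors' implementation: the class names (FrontNet, BackGenerator, ImageDiscriminator), layer sizes, the zero flow field and the backward-warping step are assumptions chosen only to show how a front convolution-deconvolution network, a conditional back generator, an image discriminator and optical-flow-based warping of the previous output frame could fit together. In the paper's actual pipeline the flow would come from a dedicated optical flow network and a video discriminator would additionally judge short clips for temporal consistency; both are omitted here for brevity.

```python
# Illustrative sketch of a virtual -> segmentation -> real video translation pipeline.
# Shapes, layer sizes and the warping step are assumptions, not the published model.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FrontNet(nn.Module):
    """Convolution-deconvolution network: virtual frame -> semantic segmentation map."""
    def __init__(self, num_classes=19):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


class BackGenerator(nn.Module):
    """Conditional generator: segmentation map + warped previous output -> real-looking frame."""
    def __init__(self, num_classes=19):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes + 3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, seg, warped_prev):
        return self.net(torch.cat([seg, warped_prev], dim=1))


class ImageDiscriminator(nn.Module):
    """Judges single frames conditioned on the segmentation map (PatchGAN-style)."""
    def __init__(self, num_classes=19):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes + 3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )

    def forward(self, seg, frame):
        return self.net(torch.cat([seg, frame], dim=1))


def warp(frame, flow):
    """Backward-warp the previous frame with a dense optical flow field (channel 0 = x, 1 = y)."""
    n, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(frame.device)   # (2, H, W) pixel coordinates
    coords = grid.unsqueeze(0) + flow                               # displaced sampling coordinates
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0                   # normalise to [-1, 1]
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid_norm = torch.stack((coords_x, coords_y), dim=-1)           # (N, H, W, 2)
    return F.grid_sample(frame, grid_norm, align_corners=True)


if __name__ == "__main__":
    front, gen, disc_img = FrontNet(), BackGenerator(), ImageDiscriminator()
    virtual = torch.randn(1, 3, 64, 128)        # current virtual frame
    prev_real = torch.zeros(1, 3, 64, 128)      # previously translated frame
    flow = torch.zeros(1, 2, 64, 128)           # flow field (would come from an optical flow network)
    seg = torch.softmax(front(virtual), dim=1)  # virtual -> semantic segmentation
    fake = gen(seg, warp(prev_real, flow))      # segmentation -> real-looking frame
    score = disc_img(seg, fake)                 # image discriminator response
    print(seg.shape, fake.shape, score.shape)
```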
