Depth estimation model of single haze image based on conditional generative adversarial network

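Cite this article:	ZHANG Wentao, WANG Yuanyu, LI Saize. Depth estimation model of single haze image based on conditional generative adversarial network[J]. Journal of Computer Applications, 2022, 42(9): 2865-2875.

Authors:	Wentao ZHANG, Yuanyu WANG, Saize LI

Affiliation:	College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China

Funding:	Natural Science Foundation of Shanxi Province (201801D121142); Research Funding Program for Returned Overseas Scholars of Shanxi Province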

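Abstract:	To address the degradation of traditional depth estimation models caused by image quality loss in hazy environments, a depth estimation model for single haze images is proposed, based on a Conditional Generative Adversarial Network (CGAN) fused with a dual attention mechanism. First, for the generator's network structure, a DenseUnet structure fused with the dual attention mechanism is proposed. DenseUnet uses dense blocks as the basic modules in the encoding and decoding stages of U-net, and uses dense connections and skip connections to strengthen information flow while extracting the low-level structural features and high-level depth information of the direct transmission rate map. Then, the dual attention module adaptively adjusts the global dependencies of spatial features and channel features, and the least absolute value loss, perceptual loss, gradient loss, and adversarial loss are combined into a new structure-preserving loss function. Finally, with the direct transmission rate map of the haze image as the condition of the CGAN, the depth map of the haze image is estimated through adversarial learning between the generator and the discriminator. Training and testing were carried out on the indoor dataset NYU Depth v2 and the outdoor dataset DIODE. Experimental results show that the model yields finer geometric structure and richer local details. On NYU Depth v2, compared with the fully convolutional residual network, the Logarithmic Mean Error (LME) and Root Mean Square Error (RMSE) are reduced by 7% and 10%, respectively; on DIODE, compared with the deep ordinal regression network, the accuracy (threshold less than 1.25) is improved by 7.6%. The proposed model thus improves the accuracy and generalization ability of depth estimation under haze interference.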
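
The structure-preserving loss described in the abstract combines four terms. The abstract does not give the weights or the exact form of each term, so the following is only a minimal NumPy sketch under stated assumptions: the gradient loss is taken to be the mean absolute difference of finite-difference image gradients, the perceptual and adversarial terms are passed in as precomputed scalars (they require trained networks), and the function names and default weights are hypothetical.

```python
import numpy as np

def l1_loss(pred, gt):
    # Least absolute value (L1) loss: mean absolute depth error.
    return float(np.mean(np.abs(pred - gt)))

def gradient_loss(pred, gt):
    # Assumed form: compare horizontal and vertical finite differences
    # of the predicted and ground-truth depth maps.
    dx = np.diff(pred, axis=1) - np.diff(gt, axis=1)
    dy = np.diff(pred, axis=0) - np.diff(gt, axis=0)
    return float(np.mean(np.abs(dx)) + np.mean(np.abs(dy)))

def structure_preserving_loss(pred, gt, perceptual=0.0, adversarial=0.0,
                              weights=(1.0, 1.0, 1.0, 1.0)):
    # `perceptual` and `adversarial` are scalars computed elsewhere;
    # the weights are illustrative, not values from the paper.
    w_l1, w_perc, w_grad, w_adv = weights
    return (w_l1 * l1_loss(pred, gt)
            + w_perc * perceptual
            + w_grad * gradient_loss(pred, gt)
            + w_adv * adversarial)

gt = np.zeros((3, 3))
ramp = np.tile(np.array([0.0, 1.0, 2.0]), (3, 1))
print(structure_preserving_loss(ramp, gt))
```

A constant depth offset changes only the L1 term, while an error that varies across the image also raises the gradient term, which is why a gradient loss tends to penalize blurred or misplaced edges.
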
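Keywords:	depth estimation; haze image; attention mechanism; gradient loss; conditional generative adversarial network; direct transmission rate map
Received:	2021-08-03
Revised:	2021-11-22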
Depth estimation model of single haze image based on conditional generative adversarial network

Wentao ZHANG, Yuanyu WANG, Saize LI. Depth estimation model of single haze image based on conditional generative adversarial network[J]. Journal of Computer Applications, 2022, 42(9): 2865-2875.

Authors:	Wentao ZHANG, Yuanyu WANG, Saize LI

Affiliation:	College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China

Abstract:	To address the degradation of traditional depth estimation models caused by reduced image quality in hazy environments, a model based on a Conditional Generative Adversarial Network (CGAN) fused with a dual attention mechanism was proposed to estimate the depth of a single haze image. Firstly, for the network structure of the generator, a DenseUnet structure fused with the dual attention mechanism was proposed. Dense blocks were used as the basic modules in the encoding and decoding processes of U-net, and dense connections and skip connections were used to enhance information flow while extracting the low-level structural features and high-level depth information of the direct transmission rate map. Then, the global dependencies of spatial features and channel features were adaptively adjusted by the dual attention module. At the same time, a new structure-preserving loss function was formed by combining the least absolute value loss, perceptual loss, gradient loss, and adversarial loss. Finally, with the direct transmission rate map of the haze image as the condition of the CGAN, the depth map of the haze image was estimated through adversarial learning between the generator and the discriminator. Training and testing were performed on the indoor dataset NYU Depth v2 and the outdoor dataset DIODE. Experimental results show that the proposed model recovers finer geometric structure and richer local details. On NYU Depth v2, compared with the fully convolutional residual network, the proposed model reduced the Logarithmic Mean Error (LME) and the Root Mean Square Error (RMSE) by 7% and 10%, respectively. On DIODE, compared with the deep ordinal regression network, the proposed model increased the accuracy (threshold less than 1.25) by 7.6%. The proposed model therefore improves the accuracy and generalization ability of depth estimation under haze interference.
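
The three reported metrics can be illustrated with a short NumPy sketch. The abstract does not spell out the formulas, so the definitions below are assumptions based on common practice in depth estimation: RMSE over valid pixels, LME as the mean absolute difference of base-10 logarithms, and threshold accuracy as the fraction of pixels whose ratio max(pred/gt, gt/pred) is below 1.25. The function name and the returned keys are hypothetical.

```python
import numpy as np

def depth_metrics(pred, gt, threshold=1.25):
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    # Keep only pixels with positive depth so logs and ratios are defined.
    valid = (gt > 0) & (pred > 0)
    pred, gt = pred[valid], gt[valid]
    # Root mean square error.
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    # Assumed LME definition: mean |log10(pred) - log10(gt)|.
    lme = np.mean(np.abs(np.log10(pred) - np.log10(gt)))
    # Threshold accuracy: share of pixels within the ratio threshold.
    ratio = np.maximum(pred / gt, gt / pred)
    accuracy = np.mean(ratio < threshold)
    return {"rmse": float(rmse), "lme": float(lme), "accuracy": float(accuracy)}

gt = np.array([[1.0, 2.0], [4.0, 10.0]])
pred = np.array([[1.0, 2.0], [4.0, 20.0]])
print(depth_metrics(pred, gt))
```

Lower RMSE and LME are better, while higher threshold accuracy is better, which matches the direction of the improvements reported above.
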

Keywords:	depth estimation; haze image; attention mechanism; gradient loss; Conditional Generative Adversarial Network (CGAN); direct transmission rate map
