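GP-WIRGAN: A Gradient-Penalty-Optimized Wasserstein Image Recurrent Generative Adversarial Network Model
Cite this article: FENG Yong, ZHANG Chun-Ping, QIANG Bao-Hua, ZHANG Yi-Yang, SHANG Jia-Xing. GP-WIRGAN: A Gradient-Penalty-Optimized Wasserstein Image Recurrent Generative Adversarial Network Model[J]. Chinese Journal of Computers, 2020, 43(2): 190-205
Authors: FENG Yong  ZHANG Chun-Ping  QIANG Bao-Hua  ZHANG Yi-Yang  SHANG Jia-Xing
Affiliations: College of Computer Science, Chongqing University, Chongqing 400044; Key Laboratory of Dependable Service Computing in Cyber Physical Society (Chongqing University), Ministry of Education, Chongqing 400044; Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, Guangxi 541004; Guangxi Key Laboratory of Optoelectronic Information Processing (cultivation base), Guilin University of Electronic Technology, Guilin, Guangxi 541004
Funding: Supported by the National Natural Science Foundation of China (61762025); the National Key Research and Development Program of China (2017YFB1402400); the Chongqing Basic and Frontier Research Program (cstc2017jcyjAX0340); the Open Project of the Guangxi Key Laboratory of Trusted Software (kx201701); the Fund of the Guangxi Key Laboratory of Optoelectronic Information Processing (cultivation base) (GD18202); the Chongqing Special Project for Key Industry Generic Key Technology Innovation (cstc2017zdcy-zdyxx0047); and the Chongqing Special Project for Science and Technology Innovation in Social Undertakings and Livelihood Security (cstc2017shmsA20013).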
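Abstract: Existing image generation models typically produce an image in a single forward pass, whereas in practice a painter usually completes a painting only after repeated revision. Generative adversarial networks (GAN) can generate images but are difficult to train. While preserving the quality of the generated images, we imitate the iterative updating that occurs during painting in order to increase the diversity of generated samples and strengthen their semantics, and we introduce the Wasserstein distance, proposing the Wasserstein Image Recurrent Generative Adversarial Networks Model (WIRGAN). WIRGAN defines a generative model and a discriminative model. The generative model is a recurrent structure composed of a series of structurally identical neural networks; a time-step parameter T controls the number of recurrences, images are generated iteratively, and the image produced by the last recurrent unit is taken as the output of the whole generative model. The discriminative model is also built from a neural network and, combined with weight clipping, judges whether an input image is generated or real. WIRGAN uses the Wasserstein distance as its objective function and trains the generative and discriminative models adversarially. Because the model is difficult to optimize, this paper further introduces a gradient penalty to address the problem, proposing the Gradient Penalty Optimized Wasserstein Image Recurrent Generative Adversarial Networks Model (GP-WIRGAN). Finally, WIRGAN and GP-WIRGAN are evaluated on four datasets (MNIST, CIFAR10, CelebA, and LSUN) in six groups of experiments: basic learning ability, GAM comparison between models, GAM comparison within the model, inception score comparison, visualization of generated images, and time-efficiency comparison. Evaluation uses the Generative Adversarial Metric (GAM) and Inception Scores. The results show that the proposed WIRGAN and GP-WIRGAN are stable and can generate high-quality images.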
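
The abstract names two ways of constraining the discriminator (critic): weight clipping in WIRGAN and a gradient penalty in GP-WIRGAN. The following is a minimal pure-Python sketch of both on a toy linear critic, not the paper's implementation. The critic form, the function names, the clipping bound c = 0.01, and the penalty weight lam = 10 are all assumptions for illustration (the two constants are values commonly used in the WGAN literature); the abstract specifies none of them.

```python
import math
import random

def critic(w, x):
    """Toy linear critic f(x) = w . x (assumption; the paper uses a neural network)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def wasserstein_critic_loss(w, real, fake):
    """Critic loss to minimize: mean f(fake) - mean f(real)."""
    mean_fake = sum(critic(w, x) for x in fake) / len(fake)
    mean_real = sum(critic(w, x) for x in real) / len(real)
    return mean_fake - mean_real

def clip_weights(w, c=0.01):
    """Weight clipping: force every critic weight into [-c, c]."""
    return [max(-c, min(c, wi)) for wi in w]

def input_gradient(w, x, h=1e-5):
    """Central finite-difference gradient of the critic with respect to its input."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        grad.append((critic(w, up) - critic(w, down)) / (2 * h))
    return grad

def gradient_penalty(w, real, fake, lam=10.0, rng=random):
    """Penalize deviation of the input-gradient norm from 1 at random
    interpolations between paired real and fake samples."""
    pairs = list(zip(real, fake))
    total = 0.0
    for x_real, x_fake in pairs:
        eps = rng.random()
        x_hat = [eps * a + (1 - eps) * b for a, b in zip(x_real, x_fake)]
        norm = math.sqrt(sum(g * g for g in input_gradient(w, x_hat)))
        total += (norm - 1.0) ** 2
    return lam * total / len(pairs)
```

In a gradient-penalty setup the critic would minimize `wasserstein_critic_loss(...) + gradient_penalty(...)` and skip `clip_weights`; in a weight-clipping setup it would minimize the first term alone and call `clip_weights` after each update.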

Keywords: image generation  generative adversarial networks  Wasserstein distance  deep learning  weight clipping  gradient penalty

GP-WIRGAN: A Novel Image Recurrent Generative Adversarial Network Model Based on Wasserstein and Gradient Penalty
FENG Yong, ZHANG Chun-Ping, QIANG Bao-Hua, ZHANG Yi-Yang, SHANG Jia-Xing. GP-WIRGAN: A Novel Image Recurrent Generative Adversarial Network Model Based on Wasserstein and Gradient Penalty[J]. Chinese Journal of Computers, 2020, 43(2): 190-205
Authors: FENG Yong  ZHANG Chun-Ping  QIANG Bao-Hua  ZHANG Yi-Yang  SHANG Jia-Xing
Affiliation: (College of Computer Science, Chongqing University, Chongqing 400030; Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing University, Chongqing 400030; Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, Guangxi 541004; Guangxi Key Laboratory of Optoelectronic Information Processing, Guilin University of Electronic Technology, Guilin, Guangxi 541004)
Abstract: Most image generation models produce an image in one shot, obtaining the output through a single forward pass of the generative model. In practice, however, painters repeatedly revise their work from coarse to fine, completing each painting on the basis of earlier modifications, which is a multi-stage process. Generative models reduce the need for manual labeling of image data and can capture the semantic meaning of images well; they synthesize approximately realistic data from a learned data distribution. One of the mainstream generative models is the Generative Adversarial Network (GAN). By combining game theory and deep learning, a GAN trains two networks, a generator and a discriminator, to synthesize high-quality data samples. GANs are well known for generating images, but they are difficult to train stably because of the unsuitable distance metric in the optimization objective, which leads to poor diversity in the generated samples. To guarantee the quality of the generated images while enhancing sample diversity and sample semantics, we simulate the repeated iteration and revision an artist performs while painting, and generate samples with a method we call "multi-generation". We choose the Wasserstein distance to measure the distance between the real data distribution and the generated data distribution, and propose a framework named Wasserstein Image Recurrent Generative Adversarial Networks (WIRGAN). WIRGAN defines a generative model and a discriminative model. The generative model, which generates images gradually, consists of a recurrent feedback loop structure and takes a time-step parameter T that controls the complexity of the model. The sample generated at time t is combined with the output of time t-1 by simple addition, and the generator takes the image generated at the last time step as its output. The discriminative model is also constructed from a neural network and is combined with weight clipping to determine whether the input image is generated or real. WIRGAN uses the Wasserstein distance as its cost function, which aims to decrease the discrepancy between synthesized and real samples, and is trained adversarially. In addition, a gradient penalty is used to deal with the training difficulty caused by weight clipping in WIRGAN, yielding the Gradient Penalty Optimized Wasserstein Image Recurrent Generative Adversarial Networks Model (GP-WIRGAN). Finally, we adopt the Generative Adversarial Metric (GAM) and the inception score to evaluate the quality and diversity of the generated samples. WIRGAN and GP-WIRGAN are evaluated in six sets of comparative experiments on four datasets (MNIST, CIFAR10, CelebA, and LSUN): basic learning ability, GAM comparison within the model, GAM comparison between models, inception score comparison, visualization, and time-efficiency comparison. The experiments show that the proposed models achieve good results under both evaluation criteria, indicating that WIRGAN and GP-WIRGAN are stable and can generate high-quality images.
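
The additive multi-step generation described in the abstract (the sample at time t is added to the output of time t-1, and the image from the last step is the generator's output) can be sketched as follows. This is a toy pure-Python illustration under stated assumptions: `generate_recurrent`, `toy_step`, the zero-initialized canvas, and the choice to feed the previous canvas back into each step are hypothetical and not taken from the paper, where each step is a neural network.

```python
def toy_step(z, canvas):
    """Hypothetical stand-in for one generator step: propose a correction
    that moves the canvas halfway toward the step's input z."""
    return [0.5 * (zi - ci) for zi, ci in zip(z, canvas)]

def generate_recurrent(noise_steps, step_fn=toy_step):
    """Run T = len(noise_steps) generation steps and return every canvas.

    Update rule: canvas_t = canvas_{t-1} + step_fn(z_t, canvas_{t-1}).
    The last canvas in the returned list is the generator's output.
    """
    if not noise_steps:
        raise ValueError("need at least one time step")
    canvas = [0.0] * len(noise_steps[0])  # assumed blank starting canvas
    history = []
    for z in noise_steps:
        delta = step_fn(z, canvas)
        canvas = [c + d for c, d in zip(canvas, delta)]
        history.append(canvas)
    return history

# Usage: T = 2 steps on a 2-pixel "image".
history = generate_recurrent([[2.0, 4.0], [2.0, 4.0]])
output = history[-1]  # image from the last time step
```

Returning every intermediate canvas rather than only the last one makes it easy to inspect how the image is refined from step to step; only `history[-1]` would be passed to the discriminator in this sketch.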
Keywords: image generation  generative adversarial networks  Wasserstein distance  deep learning  weight clipping  gradient penalty
This article is indexed in databases including VIP and Wanfang Data.