
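TCSNGAN: image generation model based on Transformer and CNN with spectral normalization
Cite this article: Qian Huimin, Mao Qiuling, Chen Shi, Han Yixing, Lv Benjie. TCSNGAN: image generation model based on Transformer and CNN with spectral normalization[J]. Application Research of Computers, 2024, 41(4): 1221-1227
Authors: Qian Huimin  Mao Qiuling  Chen Shi  Han Yixing  Lv Benjie
Affiliation: College of Artificial Intelligence and Automation, Hohai University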
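Abstract: Generative adversarial networks (GANs) have become one of the commonly used models for image generation. However, the GAN discriminator is prone to vanishing gradients during training, which makes training unstable; as a result, an optimal GAN cannot be obtained and the quality of the generated images suffers. To address this problem, a CNN with spectral normalization (CSN) that satisfies the Lipschitz condition is designed as the discriminator, and a Transformer, which has stronger representational capacity, is adopted as the generator, yielding the image generation model TCSNGAN. The CSN discriminator has a simple network structure, resolves the training instability of the GAN model, and allows the number of CSN modules to be configured according to the image resolution of the dataset so that the model achieves its best performance. Experimental results on the public datasets CIFAR-10 and STL-10 show that TCSNGAN has low model complexity and generates high-quality images; experimental results on fire image generation show that TCSNGAN can effectively address the augmentation of small-sample datasets.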

Keywords: generative adversarial network  image generation  Transformer  Lipschitz discriminator
Received: 2023-07-25
Revised: 2024-03-12

TCSNGAN: image generation model based on Transformer and CNN with spectral normalization
Qian Huimin, Mao Qiuling, Chen Shi, Han Yixing and Lv Benjie. TCSNGAN: image generation model based on Transformer and CNN with spectral normalization[J]. Application Research of Computers, 2024, 41(4): 1221-1227
Authors:Qian Huimin  Mao Qiuling  Chen Shi  Han Yixing  Lv Benjie
Affiliation: College of Artificial Intelligence and Automation, Hohai University
Abstract: GANs have become one of the most commonly used image generation models. However, the discriminator of a GAN is prone to the vanishing gradient problem during training, which makes training unstable. As a result, it is difficult to obtain the optimal GAN, and the quality of the generated images is poor. To solve this problem, this paper designs a CNN with spectral normalization (CSN) that satisfies the Lipschitz condition as the discriminator. Together with a Transformer generator, this yields the proposed image generation model, TCSNGAN (Transformer CSN GAN). Because the CSN discriminator satisfies the Lipschitz condition, it can solve the problem of training instability. The network structure of the discriminator is simple, and the number of CSN modules is adjustable so that the optimal configuration can be chosen for each dataset. Experiments on the public datasets CIFAR-10 and STL-10 show that the proposed TCSNGAN model has low complexity and generates good-quality images. Experiments on a fire image generation task demonstrate its effectiveness for augmenting small-sample datasets.
Keywords:generative adversarial networks   image generation   Transformer   Lipschitz discriminator
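
Note: the abstract does not spell out how the CSN discriminator enforces the Lipschitz condition. As a generic illustration of spectral normalization, and not the authors' CSN module, the following minimal sketch estimates the largest singular value of a weight matrix by power iteration and divides the matrix by it, so that the resulting linear map is approximately 1-Lipschitz. The function names (`matvec`, `l2_norm`, `spectral_norm`, `spectrally_normalize`) and the use of a plain list-of-rows matrix are assumptions made for this example only.

```python
import math
import random


def matvec(w, v):
    """Return the matrix-vector product w @ v for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in w]


def l2_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def spectral_norm(w, n_iters=50, seed=0):
    """Estimate the largest singular value of w by power iteration."""
    rng = random.Random(seed)
    wt = [list(col) for col in zip(*w)]          # transpose of w
    v = [rng.gauss(0.0, 1.0) for _ in w[0]]      # random start vector
    for _ in range(n_iters):
        u = matvec(w, v)                         # u = W v
        v = matvec(wt, u)                        # v = W^T W v
        norm_v = l2_norm(v)
        if norm_v == 0.0:                        # w maps v to zero (e.g. zero matrix)
            return 0.0
        v = [x / norm_v for x in v]              # keep v at unit length
    return l2_norm(matvec(w, v))                 # sigma is about ||W v|| for unit v


def spectrally_normalize(w, n_iters=50):
    """Divide w by its estimated spectral norm (hypothetical helper)."""
    sigma = spectral_norm(w, n_iters)
    if sigma == 0.0:
        return [row[:] for row in w]             # nothing to scale
    return [[x / sigma for x in row] for row in w]


if __name__ == "__main__":
    w = [[3.0, 0.0], [0.0, 1.0]]
    w_sn = spectrally_normalize(w)
    # Prints the estimated spectral norm before and after normalization.
    print(spectral_norm(w), spectral_norm(w_sn))
```

In a convolutional discriminator the same idea is usually applied to each layer's weight tensor reshaped into a matrix; deep learning frameworks typically run only one power-iteration step per training update and reuse the vector between steps, whereas this sketch runs many steps for clarity.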