
Image Synthesis with Semantic Region Style Constraint

Cite this article:	HU Yu-jie, CHANG Jian-hui, ZHANG Jian. Image Synthesis with Semantic Region Style Constraint[J]. Computer Science, 2021, 48(2): 134-141.

Authors:	HU Yu-jie, CHANG Jian-hui, ZHANG Jian

Affiliation:	Shenzhen Graduate School, Peking University, Shenzhen, Guangdong 518055, China (all three authors)

Funding:	Shenzhen science and technology R&D project; National Natural Science Foundation of China

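Abstract:	Generative adversarial networks have developed rapidly in recent years, and combining semantic region segmentation with generative models has opened a new direction for image generation research. In current work, semantic information serves as the condition that guides generation, so an ideal image with a specific style can be produced by editing and controlling the input semantic segmentation mask. This paper proposes an image generation framework with semantic region style constraints that uses a conditional generative adversarial network to achieve adaptive, region-wise style control. Specifically, the semantic segmentation map of the image is first obtained, and a style encoder extracts the style information of the different semantic regions in the image. Then, at the generation end, the style information and the semantic mask are each affine-transformed into a set of modulation parameters for every residual block of the generator, giving two sets of parameters. Finally, the semantic feature map fed into the generator is combined by a weighted sum according to the modulation parameters of each residual block, and the target style content is generated progressively through convolution and upsampling, which effectively combines semantic and style information. To address the difficulty existing models have in precisely controlling the style of each semantic region, a new style constraint loss is designed that constrains regional style changes at the semantic level and reduces the mutual influence between the style codes of different semantic regions. In addition, weight quantization compresses the generator's parameter storage to 15.6% of its original size without affecting performance, effectively reducing the model's storage consumption. Experimental results show that the proposed model significantly improves generation quality over existing methods in both subjective perception and objective metrics, with the FID score improving by about 3.8% over the current best model.
Keywords:	Conditional generative model; Adaptive normalization; Image generation; Generative adversarial networks; Deep learning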
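
The region-wise modulation described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the average-pooling style encoder, the matrices `w_style` and `w_mask`, and the fixed blend weight `alpha` are all assumptions made for illustration. The sketch pools one style code per semantic region, maps the style codes and the mask to two sets of modulation parameters, and blends them by a weighted sum before scaling and shifting the normalized features.

```python
import numpy as np


def region_style_codes(features, mask):
    """Average-pool features inside each semantic region (assumed encoder head).

    features: (C, H, W) float array; mask: (K, H, W) one-hot region mask.
    Returns a (K, C) matrix holding one style code per region.
    """
    area = mask.sum(axis=(1, 2))                      # pixels per region, (K,)
    sums = np.einsum("khw,chw->kc", mask, features)   # per-region feature sums
    return sums / np.maximum(area, 1.0)[:, None]      # empty regions yield zeros


def region_modulate(x, mask, codes, w_style, w_mask, alpha=0.5, eps=1e-5):
    """Normalize x, then modulate it with blended style and mask parameters.

    x: (C, H, W); codes: (K, S); w_style: (S, 2C); w_mask: (K, 2C).
    alpha is a hypothetical blend weight (fixed here, learnable in practice).
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mean) / (std + eps)
    # Style branch: per-region parameters painted onto that region's pixels.
    style_params = np.einsum("kc,khw->chw", codes @ w_style, mask)
    # Mask branch: one parameter vector per class, painted the same way.
    mask_params = np.einsum("kc,khw->chw", w_mask, mask)
    # Weighted sum of the two parameter sets, then split into scale and shift.
    params = alpha * style_params + (1.0 - alpha) * mask_params
    gamma, beta = np.split(params, 2, axis=0)
    return x_norm * (1.0 + gamma) + beta
```

Because each region's parameters are only applied where its mask is nonzero, editing one region's style code leaves the modulation of every other region unchanged in this sketch.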
Image Synthesis with Semantic Region Style Constraint

HU Yu-jie, CHANG Jian-hui, ZHANG Jian. Image Synthesis with Semantic Region Style Constraint[J]. Computer Science, 2021, 48(2): 134-141.

Authors:	HU Yu-jie, CHANG Jian-hui, ZHANG Jian

Affiliation:	Shenzhen Graduate School, Peking University, Shenzhen, Guangdong 518055, China

Abstract:	In recent years, generative adversarial networks have developed rapidly, and image synthesis has become an active research direction. In particular, the combination of semantic region segmentation and generative models offers a new perspective on image synthesis. With semantic information as the condition, the input semantic segmentation mask can be edited and controlled to generate a realistic image with a desired style. However, current techniques cannot precisely control the style of each semantic region. This paper proposes a novel framework for image synthesis under a semantic region style constraint, which achieves adaptive per-region style control with a conditional generative model. First, the semantic segmentation mask of the image is obtained, and a style encoder extracts the style information of the different semantic regions. Then, at the generation end, the style information and the semantic mask are each affine-transformed into a set of modulation parameters for every residual block by adaptive normalization. The semantic feature map fed into the generator is combined by a weighted sum according to these modulation parameters, which effectively merges the semantic and style information, and the target style content is generated progressively through convolution and up-sampling. Because existing models cannot accurately control the style of each semantic region, this paper also designs a new style constraint loss that constrains per-region style changes at the semantic level and reduces the mutual influence between the style codes of different semantic regions. In addition, weight quantization compresses the generator's parameter storage to about 15.6% of its original size, effectively reducing the model's storage footprint without performance degradation. Experimental results show that the proposed model significantly outperforms existing methods both perceptually and quantitatively, with the FID score improving by about 3.8% over the state-of-the-art model.

Keywords:	Conditional generative model; Adaptive normalization; Image synthesis; Generative adversarial networks; Deep learning
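
The storage figure quoted for weight quantization can be sanity-checked with a short sketch. The abstract does not state the bit width or the quantization scheme, so everything here is an assumption: uniform min-max quantization, 32-bit floating-point originals, and a hypothetical 5-bit code width, chosen only because 5/32 = 15.625% happens to be close to the reported 15.6%.

```python
import numpy as np


def quantize_uniform(w, bits):
    """Map float weights to integer codes on a uniform grid over [min, max]."""
    levels = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / levels if hi > lo else 1.0    # guard constant tensors
    codes = np.round((w - lo) / scale).astype(np.int64)
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return codes * scale + lo


def storage_ratio(bits, float_bits=32):
    """Fraction of the original storage used, ignoring scale/offset overhead."""
    return bits / float_bits


# Illustrative use with an assumed 5-bit width.
weights = np.linspace(-1.0, 1.0, 101)
codes, scale, lo = quantize_uniform(weights, bits=5)
restored = dequantize(codes, scale, lo)
ratio = storage_ratio(5)   # 5 / 32
```

In this scheme the rounding error of each weight is bounded by half a quantization step, which is one reason a moderate bit width can leave output quality largely intact.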
This article has been indexed by databases including VIP and Wanfang Data.
