Hybrid Attention Driven Text-to-image Synthesis via Generative Adversarial Networks

International Conference on Artificial Neural Networks

Abstract

With the development of generative models, image synthesis conditioned on a given variable has gradually become an important research topic. This paper presents a novel spectral-normalization-based Hybrid Attentional Generative Adversarial Network (HAGAN) for text-to-image synthesis. The hybrid attention mechanism combines text-image cross-modal attention with self-attention over image sub-regions. The cross-modal attention mechanism helps synthesize finer-grained, text-relevant images by introducing word-level semantic information into the generative model. The self-attention resolves long-range dependencies among local image-region features during generation. With spectral normalization, training becomes more stable than in conventional GANs, which helps avoid mode collapse and vanishing or exploding gradients. We conduct experiments on the widely used Oxford-102 flower dataset and the CUB bird dataset to validate the proposed method. In both quantitative and qualitative comparisons, the results indicate that the proposed method achieves the best performance in Inception Score (IS), Fréchet Inception Distance (FID), and visual quality.
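
The abstract pairs word-to-region cross-modal attention (in the spirit of AttnGAN) with self-attention over image sub-regions (in the spirit of SAGAN). Since no code accompanies the abstract, the following PyTorch sketch is purely illustrative: the module names, projection dimensions, and the additive fusion at the end are assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    # Self-attention over image sub-regions: every spatial location can
    # attend to every other one, capturing long-range dependencies.
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key   = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):                                  # x: (b, c, h, w)
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)       # (b, hw, c//8)
        k = self.key(x).flatten(2)                         # (b, c//8, hw)
        attn = F.softmax(torch.bmm(q, k), dim=-1)          # (b, hw, hw) region-to-region weights
        v = self.value(x).flatten(2)                       # (b, c, hw)
        out = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                        # residual connection

class CrossModalAttention(nn.Module):
    # Cross-modal attention: each image region gathers a word-level
    # context vector from the text encoder's word embeddings.
    def __init__(self, channels, word_dim):
        super().__init__()
        self.proj = nn.Conv1d(word_dim, channels, 1)       # words -> image feature space

    def forward(self, x, words):          # x: (b, c, h, w); words: (b, word_dim, seq_len)
        b, c, h, w = x.shape
        words = self.proj(words)                               # (b, c, seq_len)
        regions = x.flatten(2).transpose(1, 2)                 # (b, hw, c)
        attn = F.softmax(torch.bmm(regions, words), dim=-1)    # (b, hw, seq_len)
        context = torch.bmm(attn, words.transpose(1, 2))       # (b, hw, c)
        return context.transpose(1, 2).view(b, c, h, w)        # word context per region

feats = torch.randn(2, 64, 16, 16)    # illustrative generator feature map
words = torch.randn(2, 256, 18)       # illustrative word embeddings (18 words, 256-d)
hybrid = SelfAttention(64)(feats) + CrossModalAttention(64, 256)(feats, words)

A hybrid block could, for instance, add the word-context map to the self-attended features before the next upsampling stage, as above; how HAGAN actually fuses the two streams is not specified in the abstract.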
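Spectral normalization, the training stabilizer the abstract credits, is available off the shelf in PyTorch as torch.nn.utils.spectral_norm. Below is a minimal sketch of a discriminator wrapped with it; the architecture is an illustrative assumption, since the paper's exact network is not given in the abstract.

import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class SNDiscriminator(nn.Module):
    # Each conv is wrapped with spectral_norm, which rescales the weight
    # by its largest singular value so the layer stays roughly 1-Lipschitz;
    # this is what keeps discriminator gradients from exploding.
    def __init__(self, in_ch=3, base=64):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Conv2d(in_ch, base, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2, inplace=True),
            spectral_norm(nn.Conv2d(base, base * 2, 4, stride=2, padding=1)),
            nn.LeakyReLU(0.2, inplace=True),
            spectral_norm(nn.Conv2d(base * 2, 1, 4)),  # real/fake logit map
        )

    def forward(self, x):
        return self.net(x)

logits = SNDiscriminator()(torch.randn(2, 3, 64, 64))  # -> (2, 1, 13, 13)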
