22 December 2020
Attn-Eh ALN: complex text-to-image generation with attention-enhancing adversarial learning networks
Cunyi Lin, Xianwei Rong, Ming Liu, Xiaoyan Yu
Abstract

Text-to-image generation can be widely applied in various fields, such as scene retrieval and computer-aided design. Existing approaches can generate realistic images from simple text descriptions, whereas images rendered from complex text descriptions are still not satisfactory for practical applications. To generate accurate high-resolution images from complex texts, we propose an attention-enhancing adversarial learning network (Attn-Eh ALN) based on conditional generative adversarial networks and the attention mechanism. The model consists of an encoding module and a generative module. In the encoding module, we propose a local attention-driven encoding network that assigns different weights to the words in the text, enhancing the semantic representation of specific object features. The attention mechanism is employed to capture more details while preserving global information, which makes the details in the generated images more fine-grained. In the discriminating stage, we use multiple discriminators to judge the realness of the generated images, avoiding the bias caused by a single discriminator. Moreover, a semantic similarity judgment module is introduced to improve the semantic consistency between the text description and the visual content. Experimental results on benchmark datasets indicate that Attn-Eh ALN compares favorably with other state-of-the-art methods in both qualitative and quantitative assessments.
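
The abstract does not give the attention formula or the rule for combining discriminators, so the following is only a minimal sketch of the two ideas under stated assumptions: word weights are taken as a softmax over dot products between one image-region feature and each word embedding, and the discriminator outputs are fused by a plain average. The function names `attend_words` and `mean_realness`, the toy embeddings, and both rules are hypothetical illustrations, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_words(region, word_vectors):
    """Weight each word by its dot-product similarity to one region feature.

    Assumption: generic dot-product attention, not the paper's exact form.
    Returns (weights, context), where weights sum to 1 and context is the
    weighted sum of the word vectors.
    """
    scores = [sum(r * w for r, w in zip(region, vec)) for vec in word_vectors]
    weights = softmax(scores)
    dim = len(word_vectors[0])
    context = [
        sum(a * vec[i] for a, vec in zip(weights, word_vectors))
        for i in range(dim)
    ]
    return weights, context


def mean_realness(discriminator_scores):
    """Average several discriminators' realness scores (assumed fusion rule)."""
    return sum(discriminator_scores) / len(discriminator_scores)


# Toy data: three words with made-up 2-d embeddings and one region feature.
words = ["a", "red", "bird"]
word_vectors = [[0.1, 0.0], [1.0, 0.2], [0.3, 1.0]]
region = [2.0, 0.0]  # hypothetical region feature most similar to "red"

weights, context = attend_words(region, word_vectors)
best_word = words[weights.index(max(weights))]
fused = mean_realness([0.2, 0.4, 0.9])  # three hypothetical discriminators
```

In this toy setup the word whose embedding is most similar to the region feature receives the largest weight, which is the sense in which different words contribute unequally to a given image detail. Averaging is only one possible way to reduce reliance on a single discriminator.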

© 2020 SPIE and IS&T 1017-9909/2020/$28.00
Cunyi Lin, Xianwei Rong, Ming Liu, and Xiaoyan Yu "Attn-Eh ALN: complex text-to-image generation with attention-enhancing adversarial learning networks," Journal of Electronic Imaging 29(6), 063014 (22 December 2020). https://doi.org/10.1117/1.JEI.29.6.063014
Received: 14 July 2020; Accepted: 30 November 2020; Published: 22 December 2020
KEYWORDS
Aluminum nitride
Gallium nitride
Computer programming
Data modeling
Visualization
Image resolution
Image processing