14 November 2023 A survey of the development of image captioning techniques
Fujun Zhang, Yinqiu Zhao
Proceedings Volume 12934, Third International Conference on Computer Graphics, Image, and Virtualization (ICCGIV 2023); 1293410 (2023) https://doi.org/10.1117/12.3008005
Event: 2023 3rd International Conference on Computer Graphics, Image and Virtualization (ICCGIV 2023), 2023, Nanjing, China
Abstract
Image description (image captioning) is an important research direction in deep learning. The task combines computer vision and natural language processing: features extracted from an image are converted into high-level semantic information and expressed as a textual description, i.e. the computer learns to "read pictures and talk". This paper collates several representative methods that have emerged in succession as image description has developed. Early research was dominated by template-based and retrieval-based methods. As deep learning flourished, deep learning-based techniques became mainstream, starting with end-to-end encoder-decoder models, which were subsequently refined with the attention mechanism. More recently, techniques based on the Transformer and on generative adversarial networks have greatly improved description accuracy.
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Fujun Zhang and Yinqiu Zhao "A survey of the development of image captioning techniques", Proc. SPIE 12934, Third International Conference on Computer Graphics, Image, and Virtualization (ICCGIV 2023), 1293410 (14 November 2023); https://doi.org/10.1117/12.3008005
KEYWORDS
Image processing

Deep learning

Image fusion

Artificial intelligence

Image enhancement

Transformers
