14 August 2019 An image caption model incorporating high-level semantic features
Proceedings Volume 11179, Eleventh International Conference on Digital Image Processing (ICDIP 2019); 1117917 (2019) https://doi.org/10.1117/12.2540579
Event: Eleventh International Conference on Digital Image Processing (ICDIP 2019), 2019, Guangzhou, China
Abstract
The encoder-decoder framework has attracted great interest in image captioning. It focuses on extracting low-level features and achieves good results. Performance can be further improved if high-level semantics are considered. In this work, we propose a new image caption model that incorporates high-level semantic features through a revised Convolutional Neural Network (CNN). Both the low-level image features and the high-level semantic features are fed into Long Short-Term Memory networks (LSTMs) to generate natural sentence descriptions. We show in a number of experiments on the Flickr8K and Flickr30K datasets that our method outperforms most standard network baselines for image captioning.
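As an illustration of the data flow described in the abstract, the following minimal, dependency-free sketch fuses a low-level image feature vector with a high-level semantic feature vector and passes the result through one standard LSTM step. The abstract does not say how the two feature types are combined, so plain concatenation is an assumption here. The function names (`fuse_features`, `init_lstm`, `lstm_step`), the vector sizes, and the toy values are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def init_lstm(input_size, hidden_size, seed=0):
    """Small random weights for the four gates (input, forget, output, candidate)."""
    rng = random.Random(seed)
    rows = 4 * hidden_size
    cols = input_size + hidden_size
    W = [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    b = [0.0] * rows
    return W, b


def lstm_step(x, h, c, W, b):
    """One standard LSTM step; returns the new hidden and cell states."""
    n = len(h)
    z = [zi + bi for zi, bi in zip(matvec(W, x + h), b)]
    i = [sigmoid(v) for v in z[0:n]]              # input gate
    f = [sigmoid(v) for v in z[n:2 * n]]          # forget gate
    o = [sigmoid(v) for v in z[2 * n:3 * n]]      # output gate
    g = [math.tanh(v) for v in z[3 * n:4 * n]]    # candidate cell values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def fuse_features(image_feat, semantic_feat):
    """Assumed fusion: concatenate the two feature vectors."""
    return list(image_feat) + list(semantic_feat)


# Toy vectors standing in for CNN outputs (hypothetical values).
image_feat = [0.2, -0.5, 0.1, 0.7]    # low-level image features
semantic_feat = [0.9, 0.0, 0.3]       # high-level semantic scores

x = fuse_features(image_feat, semantic_feat)
hidden_size = 5
W, b = init_lstm(len(x), hidden_size)
h = [0.0] * hidden_size
c = [0.0] * hidden_size
h, c = lstm_step(x, h, c, W, b)
```

In a full caption model, the hidden state `h` would typically be projected onto a vocabulary to predict the next word, and the step would be repeated with word embeddings as input until an end-of-sentence token is produced. Those parts are omitted here.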
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Zhiwang Luo, Jiwei Hu, Quan Liu, and Jiamei Deng "An image caption model incorporating high-level semantic features", Proc. SPIE 11179, Eleventh International Conference on Digital Image Processing (ICDIP 2019), 1117917 (14 August 2019); https://doi.org/10.1117/12.2540579
KEYWORDS
Associative arrays
Data modeling
Feature extraction
Principal component analysis
Convolutional neural networks
Computing systems
Image processing