Adversarial invariant-specific representations fusion network for multimodal sentiment analysis

Jing He; Binghui Su; Zhenwen Sheng; Changfan Zhang; Haonan Yang

doi:10.1117/12.2680996

8 June 2023

Proceedings Volume 12707, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023); 127073R (2023) https://doi.org/10.1117/12.2680996
Event: International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023), 2023, Changsha, China

Abstract

Multimodal sentiment analysis (MSA) has made substantial progress as an open and practical research field. Many MSA methods focus on how to fuse multimodal data to improve model performance. However, the heterogeneity gaps between modalities pose a significant challenge to the fusion interactions of multimodal data. This work proposes a new MSA model, the adversarial invariant-specific representations fusion (AISRF) network. AISRF obtains modality-invariant representations by narrowing the distribution gaps among the modalities while preserving the integrity of the modality-specific representations. First, an adversarial encoder-regressor reduces the heterogeneity gaps among the modalities, yielding the modality-invariant representations. Second, decoders reconstruct the modality-invariant representations to obtain the modality-specific representations. Finally, cross-modal attention performs cross-modal interactions on the invariant-specific representations from the different modalities, enabling efficient multimodal fusion. Comparative experiments against baseline models on the widely used benchmark datasets CMU-MOSI and CMU-MOSEI show that AISRF outperforms the baselines on multiple evaluation metrics.
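To make the final fusion step more concrete, the following is a minimal sketch of generic cross-modal attention, in which queries come from one modality and keys and values come from another. It is not the authors' implementation: the abstract does not give the AISRF layer definitions, so the function names (`softmax`, `cross_modal_attention`), the plain scaled dot-product form, the absence of learned projections, and the toy feature values are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention across two modalities (hypothetical helper).

    queries: list of d-dimensional vectors from the target modality (e.g. text)
    keys, values: equal-length lists of vectors from the source modality
                  (e.g. audio); each key is d-dimensional
    Returns one attended vector per query.
    """
    d = len(queries[0])
    dim_v = len(values[0])
    attended = []
    for q in queries:
        # Similarity of this query to every source-modality time step.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the source-modality values.
        attended.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(dim_v)
        ])
    return attended


# Toy, made-up features: 2 text steps attend over 3 audio steps.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
audio_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_modal_attention(text_feats, audio_feats, audio_feats)
```

Each row of `fused` is a convex combination of the audio vectors, weighted by how closely each audio step matches the corresponding text step. A full model would typically add learned query, key, and value projections and apply the attention in both directions between each pair of modalities.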

Citation

Jing He, Binghui Su, Zhenwen Sheng, Changfan Zhang, and Haonan Yang "Adversarial invariant-specific representations fusion network for multimodal sentiment analysis", Proc. SPIE 12707, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023), 127073R (8 June 2023); https://doi.org/10.1117/12.2680996

Proceedings paper, 13 pages

KEYWORDS

Machine learning

Data modeling

Visualization

Adversarial training

Performance modeling

Data fusion

Feature extraction
