Explaining Emotional Attitude Through the Task of Image-captioning
Author
Bisikalo, O.
Kovenko, V.
Bogach, I.
Chorna, O.
Date
2022
Collections
- Наукові роботи каф. АІІТ (Scientific works of the Dept. of AIIT) [263]
Abstract
Deep learning algorithms trained on huge datasets containing visual and textual information
have been shown to learn features that are useful for other downstream tasks. This implies that
such models understand the data at different levels of a hierarchy. In this paper we study the
ability of SOTA (state-of-the-art) models for both text and images to understand the
emotional attitude evoked by a situation. For this purpose we gathered a small dataset
derived from the IMDB-WIKI dataset and annotated it specifically for the task. To investigate
how well the pretrained models understand the data, a KNN clustering procedure is applied in
parallel to the text and image representations. It is shown that although the models used
are not capable of solving the task on their own, a transfer learning procedure based on
them helps to improve results on tasks such as image captioning and sentiment analysis. We
then frame our problem as an image captioning task and experiment with different
architectures and training approaches. Finally, we show that adding biometric
features such as emotion probabilities and gender probabilities improves the results and
leads to a better understanding of emotional attitude.
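The KNN probing of pretrained representations mentioned in the abstract could be sketched as follows. This is a minimal illustration, not the authors' code: the embeddings here are synthetic stand-ins for text or image features, and the function name and neighbourhood-agreement metric are illustrative assumptions.

```python
import numpy as np

def knn_label_agreement(embeddings, labels, k=5):
    """For each sample, find its k nearest neighbours (Euclidean distance)
    and measure how often they share the sample's label. High agreement
    suggests the representation space separates the annotated classes."""
    # Pairwise squared Euclidean distances via broadcasting
    d2 = ((embeddings[:, None, :] - embeddings[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                # exclude self-matches
    neighbours = np.argsort(d2, axis=1)[:, :k]  # indices of k nearest points
    # Fraction of neighbours whose label matches the query's label
    return float((labels[neighbours] == labels[:, None]).mean())

# Toy demo: two well-separated synthetic "embedding" clusters
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 0.1, (20, 8)),
                 rng.normal(3, 0.1, (20, 8))])
lab = np.array([0] * 20 + [1] * 20)
score = knn_label_agreement(emb, lab, k=5)
```

On real data, a score near chance level would indicate that the pretrained representations do not capture the annotated emotional-attitude labels, which matches the paper's finding that the models alone cannot solve the task.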
Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Systems (CoLInS 2022), Volume I: Main Conference, May 12–13, 2022, Gliwice, Poland
URI:
http://ir.lib.vntu.edu.ua//handle/123456789/36175