Explaining Emotional Attitude Through the Task of Image-captioning
Authors
Bisikalo, O.
Kovenko, V.
Bogach, I.
Chorna, O.
Date
2022
Abstract
Deep learning algorithms trained on huge datasets of visual and textual information have been shown to learn features that are useful for downstream tasks, which implies that such models understand the data at different levels of hierarchy. In this paper we study the ability of SOTA (state-of-the-art) models for both texts and images to understand the emotional attitude caused by a situation. For this purpose we gathered a small dataset based on the IMDB-WIKI dataset and annotated it specifically for the task. To investigate the ability of the pretrained models to understand the data, a KNN clustering procedure is applied in parallel to the representations of texts and images. We show that although the models used are not capable of solving the task on their own, a transfer learning procedure based on them improves results on tasks such as image captioning and sentiment analysis. We then frame our problem as an image-captioning task and experiment with different architectures and training approaches. Finally, we show that adding biometric features such as emotion probabilities and gender probabilities improves the results and leads to a better understanding of emotional attitude.
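The abstract's clustering step, i.e. clustering image and text representations in parallel and comparing the groupings, could be sketched as below. The paper does not publish this code; the embedding dimensions, number of clusters, and use of k-means with an adjusted Rand comparison are illustrative assumptions, not the authors' exact procedure.

```python
# Hypothetical sketch: cluster precomputed image and text embeddings in
# parallel and measure how well the two groupings agree. The random
# arrays stand in for outputs of pretrained image/text encoders.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
image_emb = rng.normal(size=(200, 512))  # assumed image-encoder outputs
text_emb = rng.normal(size=(200, 384))   # assumed text-encoder outputs

def cluster(emb, k=5, seed=0):
    """L2-normalise embeddings, then cluster them with k-means."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(emb)

img_labels = cluster(image_emb)
txt_labels = cluster(text_emb)

# Agreement between the modality-wise clusterings; values near 0 suggest
# the two representation spaces group the samples differently.
print(round(adjusted_rand_score(img_labels, txt_labels), 3))
```

With representations that actually encode emotional attitude, well-separated and mutually consistent clusters would be expected; the paper reports that the pretrained models alone do not exhibit this.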
Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Systems (CoLInS 2022), Volume I: Main Conference, May 12–13, 2022, Gliwice, Poland
URI:
http://ir.lib.vntu.edu.ua//handle/123456789/36175