Image-text crisis tweet categorization: a caption-based approach

Badreddine Farah; Guillaume Cleuziou; Cécile Gracianne; Adel Hafiane; Anaïs Halftermeyer; Raphaël Canals

doi:10.59297/9j4kjp22

Authors

Badreddine Farah University of Orléans, INSA-CVL, LIFO, EA 4022, F45067 Orléans, France
Guillaume Cleuziou University of Orléans, INSA-CVL, LIFO, EA 4022, F45067 Orléans, France https://orcid.org/0000-0002-2885-1152
Cécile Gracianne BRGM, F45060 Orléans, France https://orcid.org/0000-0003-4232-5359
Adel Hafiane INSA-CVL, University of Orléans, PRISME, EA 4229, F18022 Bourges, France https://orcid.org/0000-0003-3185-9996
Anaïs Halftermeyer University of Orléans, INSA-CVL, LIFO, EA 4022, F45067 Orléans, France https://orcid.org/0000-0003-1069-191X
Raphaël Canals University of Orléans, INSA-CVL, PRISME, EA 4229, F45072 Orléans, France https://orcid.org/0000-0001-9100-7539

DOI:

https://doi.org/10.59297/9j4kjp22

Keywords:

Deep Learning, Multimodal data, text/image fusion, Crisis Data

Abstract

The growth of social media usage over the last decade has made a massive and valuable volume of multimedia data available. However, the lack of large annotated multimodal datasets, together with the inherent noise and the diversity of multimodal relations in this type of data, presents challenges for machine learning methods. Unlike classic multimodal data, social media data exhibits a wide variety of relations between image and text, which makes the interaction between the two modalities harder to model.
Previous research has concentrated on fusion strategies with separate encoders for each modality. This paper introduces CMB (Caption-based Multimodal BERT), a method for classifying crisis-related social media posts using information from both images and text. CMB translates the image modality into a text-compatible space, facilitating intermodal interaction.
Furthermore, CMB opens up training opportunities that enhance the model's robustness to missing modalities. Experimental results show that CMB is competitive with well-established, costly, and manually crafted multimodal models.
