M-CATNAT: A Multimodal dataset to analyze French tweets during natural disasters



Keywords: Deep Learning, French multimodal data, Crisis management, CrisisMMD dataset


The proliferation of social media, especially platforms such as X (formerly Twitter), has made large volumes of real-time data available to many fields. During natural disasters, such data can inform humanitarian efforts. Processing data at this volume requires automated systems, which are usually trained on annotated datasets. Although supervised learning dominates this area, multilingual and multimodal annotated datasets remain scarce. This study addresses the gap by introducing M-CATNAT, a multimodal dataset of French tweets about natural disasters. Unlike previous datasets, M-CATNAT provides annotations for the text, the images, and their multimodal combination. Following the CrisisMMD annotation guidelines, this work in progress aims to annotate 1,430 tweets, yielding over 4,500 labels. M-CATNAT thus extends the available resources beyond English and supports multimodal analysis by providing three levels of annotation for each tweet: one per modality and one for the tweet as a whole.






How to Cite

Farah, B., El Bachyr, O., Cleuziou, G., Halftermeyer, A., Gracianne, C., Auclair, S., Hafiane, A., & Canals, R. (2024). M-CATNAT: A Multimodal dataset to analyze French tweets during natural disasters. ISCRAM Proceedings, 21. http://ojs.iscram.org/index.php/Proceedings/article/view/52
