Enhancing Emergency Post Classification through Image Information Amplification via Large Language Models
DOI: https://doi.org/10.59297/x3m25n43
Abstract
Real-time information extracted from social media platforms can be highly valuable during emergencies. For example, reports and direct witness accounts can help build situational awareness in the early phases of an emergency, with the potential to save lives. However, gathering this information from large-scale social media streams and using it effectively requires suitable techniques for selecting relevant data.
Given the multimedia nature of these streams, selection techniques should understand textual and image information simultaneously, as previous studies have highlighted. Leveraging recent advances in language and vision models, we propose and evaluate a method that works with a homogeneous, text-only representation of the different modalities of social media posts. Experiments on established and novel datasets, including video data, show that the proposed method achieves state-of-the-art performance while providing a highly general, plug-and-play approach to multimodal data filtering.