Crisis2Sum: An Exploratory Study on Disaster Summarization from Multiple Streams
DOI:
https://doi.org/10.59297/ndkt2f59
Keywords:
Crisis Informatics, Disaster Summarization, Large Language Models, Integer Linear Programming
Abstract
Automatic summarization of natural and human-made disaster events is an important means of increasing situational awareness for human response organizations and disaster management. However, incorporating multiple data sources poses a challenge to current summarization systems, as the typically large document collections exceed the input limits of neural models. Additionally, Large Language Models (LLMs) often omit key information present at different positions in long-context inputs. Furthermore, disaster reporting requires fine-grained information content and therefore relaxes the restriction to high compression rates, resulting in rather long summaries. In this work, we study different extractive and LLM-based abstractive baselines and highlight shortcomings of present approaches. Our experimental results on the CrisisFACTS datasets show that LLM-based approaches tend to fail at generating long, informative summaries. Taking these limitations into account, we propose a disaster summarization framework and introduce query-focused extensions, which demonstrate advantages and superior performance over the baseline methods.