Evaluating the Performance of AI in Crisis Detection: A Multi-Scenario Hindcast of Extreme Precipitation Forecasts

Authors

  • Feng Huang, School of Safety Science, Tsinghua University (ORCID: https://orcid.org/0009-0006-9587-8925)
  • Guofeng Su, School of Safety Science, Tsinghua University, China; Key Laboratory of Investigation on Disaster and Accident, Ministry of Emergency Management, China
  • Lida Huang, School of Safety Science, Tsinghua University
  • Tao Chen, School of Safety Science, Tsinghua University
  • Jing Zhang, School of Safety Science, Tsinghua University

DOI:

https://doi.org/10.59297/sfbr7n23

Keywords:

Crisis Detection, Extreme Precipitation, AI Weather Prediction, Hindcast Evaluation

Abstract

Artificial Intelligence Weather Prediction (AIWP) models excel in global mean-error metrics, yet their efficacy in detecting low-probability, high-impact extreme events, a capability critical for emergency response, remains under-examined. This study evaluates three leading AIWP models, GraphCast, FuXi, and the Artificial Intelligence Forecasting System (AIFS), against satellite observations and a numerical baseline, the Global Forecast System (GFS), across four diverse historical crises. The crisis-centric evaluation framework comprises Peak Amplitude Ratio (PAR), Spatial Correlation (SC), Root Mean Square Error (RMSE), volumetric Bias, and the Symmetric Extremal Dependence Index (SEDI). Preliminary results reveal a systemic intensity deficit in the AIWP models: while GFS maintains a PAR above 0.65 across most scenarios, the AI models underestimate peak rainfall by over 90% and exhibit significant spatial displacement. These findings suggest that inherent statistical smoothing transforms catastrophic signals into benign forecasts. Consequently, over-reliance on current AIWP models for crisis detection may yield a false sense of security, potentially exacerbating rather than mitigating emergency vulnerabilities.
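
The five metrics named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give formal definitions, so PAR is assumed here to be the forecast maximum divided by the observed maximum, volumetric Bias is assumed to be the relative difference in total accumulated rainfall, and SC is assumed to be the Pearson correlation over grid cells. SEDI follows its standard definition in terms of hit rate H and false alarm rate F. The function names, the flattened-list input format, and the sample values are all hypothetical.

```python
import math

def peak_amplitude_ratio(forecast, observed):
    """Assumed PAR definition: forecast maximum over observed maximum."""
    return max(forecast) / max(observed)

def rmse(forecast, observed):
    """Root mean square error over all grid cells."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)

def volumetric_bias(forecast, observed):
    """Assumed definition: relative difference in total rainfall volume."""
    return (sum(forecast) - sum(observed)) / sum(observed)

def spatial_correlation(forecast, observed):
    """Assumed definition: Pearson correlation between the two fields."""
    n = len(observed)
    mean_f = sum(forecast) / n
    mean_o = sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_f * var_o)

def sedi(forecast, observed, threshold):
    """Symmetric Extremal Dependence Index for exceedance of `threshold`.

    Returns NaN when the hit rate or false alarm rate is exactly 0 or 1,
    because the logarithms in the formula are then undefined.
    """
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        if o >= threshold:
            if f >= threshold:
                hits += 1
            else:
                misses += 1
        elif f >= threshold:
            false_alarms += 1
        else:
            correct_negatives += 1
    if hits + misses == 0 or false_alarms + correct_negatives == 0:
        return float("nan")
    h = hits / (hits + misses)
    fa = false_alarms / (false_alarms + correct_negatives)
    if not (0 < h < 1 and 0 < fa < 1):
        return float("nan")
    num = math.log(fa) - math.log(h) - math.log(1 - fa) + math.log(1 - h)
    den = math.log(fa) + math.log(h) + math.log(1 - fa) + math.log(1 - h)
    return num / den

# Hypothetical rainfall totals (mm) on a flattened six-cell grid.
observed = [2.0, 5.0, 40.0, 120.0, 60.0, 8.0]
smoothed = [3.0, 6.0, 10.0, 11.0, 9.0, 5.0]     # heavily smoothed forecast
displaced = [60.0, 5.0, 40.0, 120.0, 10.0, 8.0]  # right peak, one cell misplaced

if __name__ == "__main__":
    print("PAR (smoothed):", peak_amplitude_ratio(smoothed, observed))
    print("SEDI at 50 mm (smoothed):", sedi(smoothed, observed, 50.0))
    print("SEDI at 50 mm (displaced):", sedi(displaced, observed, 50.0))
```

Under the assumed PAR definition, a peak underestimate of over 90% corresponds to a PAR below 0.10, compared with the value above 0.65 reported for GFS. The smoothed sample forecast also never crosses the 50 mm threshold, so its hit rate is zero and SEDI is undefined for it, which illustrates how a smoothed field can miss an extreme event entirely while still scoring moderately on mean-error metrics.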

Published

2026-05-22

Section

ISCRAM Proceedings

How to Cite

Huang, F., Su, G., Huang, L., Chen, T., & Zhang, J. (2026). Evaluating the Performance of AI in Crisis Detection: A Multi-Scenario Hindcast of Extreme Precipitation Forecasts. Proceedings of the International ISCRAM Conference, 23. https://doi.org/10.59297/sfbr7n23
