Cross-disaster Domain Adaptation Using Co-training Variants
DOI:
https://doi.org/10.59297/dv6dh059
Keywords:
Crisis Informatics, Co-training, Domain Adaptation, Social Media Classification
Abstract
Automated classification of crisis-related social media posts is widely used to support humanitarian response; however, models trained on historical disasters often degrade when applied to new events due to cross-disaster domain shift. In emerging crises, labeled data is scarce while large volumes of unlabeled content accumulate rapidly, making effective domain adaptation critical for reliable deployment. In this work, we investigate semi-supervised domain adaptation for cross-disaster tweet classification under temporally realistic transfer settings, where each target event occurs strictly later than its source event. We evaluate adaptation performance across multiple humanitarian disasters under low-data regimes (5–50 labeled examples per class), distinguishing between within-disaster and cross-disaster transfer. We compare fully supervised fine-tuning, self-training, and unsupervised domain adaptation (UDA) against a structured co-training framework that leverages dual-view source–target supervision and cross-view pseudo-label exchange. We further study a family of controlled design variations that modify individual components—such as pseudo-label selection and mixup regularization—to analyze their impact on cross-event generalization and calibration. Results show these co-training variants consistently outperform UDA alone in low-resource settings, while different pseudo-label utilization strategies exhibit distinct trade-offs across disaster types and label budgets. By providing a temporally grounded benchmark and a structured analysis of adaptation mechanisms, this work contributes empirical guidance for designing more robust cross-disaster classification systems for crisis informatics.
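The abstract names cross-view pseudo-label exchange, pseudo-label selection, and mixup regularization but does not give their exact form. The following is a minimal sketch of how such components might look. It assumes a fixed confidence threshold for selection and a standard convex-combination mixup. All function names, the threshold value, and the toy probabilities are hypothetical and are not taken from the paper.

```python
import numpy as np

def select_confident(probs, threshold=0.9):
    """Return (indices, labels) of rows whose top class probability >= threshold.

    Assumption: selection by fixed confidence threshold; the paper's rule may differ.
    """
    probs = np.asarray(probs, dtype=float)
    conf = probs.max(axis=1)
    idx = np.flatnonzero(conf >= threshold)
    return idx, probs[idx].argmax(axis=1)

def exchange_pseudo_labels(probs_view_a, probs_view_b, threshold=0.9):
    """Cross-view exchange: each view receives the other view's confident pseudo-labels."""
    idx_a, lab_a = select_confident(probs_view_a, threshold)
    idx_b, lab_b = select_confident(probs_view_b, threshold)
    return {"for_view_a": (idx_b, lab_b), "for_view_b": (idx_a, lab_a)}

def mixup(x1, y1, x2, y2, lam):
    """Convex combination of two feature vectors and their one-hot labels."""
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

# Toy class probabilities from two hypothetical views on three unlabeled target tweets.
probs_a = np.array([[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]])
probs_b = np.array([[0.55, 0.45], [0.03, 0.97], [0.30, 0.70]])
exchanged = exchange_pseudo_labels(probs_a, probs_b, threshold=0.9)

# Mix two toy examples with one-hot labels.
x_mix, y_mix = mixup(np.array([1.0, 0.0]), np.array([1.0, 0.0]),
                     np.array([0.0, 1.0]), np.array([0.0, 1.0]), lam=0.75)
```

In this sketch, view A would be trained on the examples view B is confident about, and vice versa, so that each view's errors are less likely to reinforce themselves than in plain self-training.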