Data Distribution Shift

Data distribution shift, a mismatch between the data a model was trained on and the data it encounters in deployment, is a central challenge in machine learning because it degrades model reliability and performance. Current research focuses on detecting shifts with statistical tools (e.g., distance measures, the Kolmogorov-Smirnov test, the Population Stability Index) and on mitigating their impact through test-time adaptation, refined data augmentation (e.g., ADLDA), and performative prediction frameworks that account for shifts induced by the model's own predictions. Addressing distribution shift is essential for robust and safe models across diverse applications, particularly in safety-critical domains such as autonomous systems and financial fraud detection.
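
As a rough illustration of the detection side, the sketch below compares one numeric feature between a reference (training) sample and a current (deployment) sample using the two-sample Kolmogorov-Smirnov test and the Population Stability Index. It is a minimal sketch and not taken from any of the listed papers: the function names `population_stability_index` and `shift_report`, the choice of 10 quantile bins, and the `eps` floor are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import ks_2samp


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two 1-D samples, using bins cut at reference quantiles.

    n_bins and eps are illustrative defaults (assumptions), not a standard.
    """
    reference = np.asarray(reference, dtype=float).ravel()
    current = np.asarray(current, dtype=float).ravel()

    # Interior cut points: the outermost bins are open-ended, so values
    # outside the reference range still fall into a bin.
    quantiles = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    cuts = np.quantile(reference, quantiles)

    # Assign each value a bin index in 0..n_bins-1 and count per bin.
    ref_idx = np.searchsorted(cuts, reference, side="right")
    cur_idx = np.searchsorted(cuts, current, side="right")
    ref_frac = np.bincount(ref_idx, minlength=n_bins) / reference.size
    cur_frac = np.bincount(cur_idx, minlength=n_bins) / current.size

    # Floor the fractions so empty bins do not produce log(0).
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)

    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))


def shift_report(reference, current):
    """Collect the KS statistic, KS p-value, and PSI for one feature."""
    ks = ks_2samp(reference, current)
    return {
        "ks_statistic": float(ks.statistic),
        "ks_pvalue": float(ks.pvalue),
        "psi": population_stability_index(reference, current),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(0.0, 1.0, size=5000)    # synthetic reference data
    deploy = rng.normal(1.0, 1.0, size=5000)   # synthetic data, mean shifted
    print(shift_report(train, deploy))
```

A commonly quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a substantial shift, but these cutoffs are conventions rather than guarantees and should be tuned per feature. With large samples the KS test flags even tiny differences, so the statistic itself is often more informative than the p-value alone.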

Papers