Distribution Shift Benchmark

Distribution shift benchmarks evaluate how robust machine learning models are when the data distribution differs between training and deployment. Current research focuses on building comprehensive benchmarks across diverse data types (tabular, time-series, images, code) and tasks (classification, anomaly detection, forecasting), often decomposing complex shifts into more granular components (e.g., covariate, prior, and concept shifts). Such benchmarks guide the development of more robust and generalizable models, improving the reliability and real-world applicability of machine learning systems across domains.
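
The decomposition into covariate, prior, and concept shifts can be made concrete with a small synthetic example. The sketch below is illustrative only and not drawn from any benchmark listed on this page; the helper functions (`label_fn`, `sample`, `resample_prior`) and all distribution parameters are assumptions chosen for clarity. It trains a classifier on a source distribution and evaluates it on test sets where exactly one component has shifted: covariate shift changes P(x) while P(y|x) stays fixed, prior (label) shift changes P(y) while P(x|y) stays fixed, and concept shift changes P(y|x) itself.

```python
# Minimal sketch of the three shift components on synthetic 1-D data.
# Assumes NumPy and scikit-learn; parameters are illustrative, not benchmark-defined.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def label_fn(x, boundary=1.0):
    """Fixed labelling rule P(y=1 | x): a sigmoid around `boundary`."""
    return 1.0 / (1.0 + np.exp(-4.0 * (x - boundary)))

def sample(n, x_mean=1.0, boundary=1.0):
    """Causal model x -> y: draw x ~ N(x_mean, 1), then y ~ Bernoulli(P(y=1|x))."""
    x = rng.normal(loc=x_mean, scale=1.0, size=n)
    y = (rng.random(n) < label_fn(x, boundary)).astype(int)
    return x.reshape(-1, 1), y

def resample_prior(X, y, p1):
    """Subsample within each class so P(y=1)=p1 while P(x|y) is unchanged
    (pure prior / label shift)."""
    idx1, idx0 = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    n = min(len(idx1), len(idx0))          # cap size so both classes have enough samples
    n1 = int(n * p1)
    keep = np.concatenate([rng.choice(idx1, n1, replace=False),
                           rng.choice(idx0, n - n1, replace=False)])
    return X[keep], y[keep]

# Train on the source distribution.
X_tr, y_tr = sample(10_000)
clf = LogisticRegression().fit(X_tr, y_tr)

# One test set per shift component, plus an in-distribution control.
X_id, y_id = sample(10_000)
tests = {
    "in-distribution": (X_id, y_id),
    "covariate shift": sample(10_000, x_mean=2.5),       # P(x) moves, P(y|x) fixed
    "prior shift":     resample_prior(X_id, y_id, 0.9),  # P(y) moves, P(x|y) fixed
    "concept shift":   sample(10_000, boundary=0.0),     # P(y|x) itself changes
}
for name, (X_te, y_te) in tests.items():
    print(f"{name:16s} accuracy = {accuracy_score(y_te, clf.predict(X_te)):.3f}")
```

A benchmark built this way reports the accuracy gap between the in-distribution test set and each shifted test set, which is the basic robustness measurement the papers below generalize to richer data types and tasks.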

Papers