Distribution Data
Research on distribution data, covering both in-distribution (ID) and out-of-distribution (OOD) data, aims to improve the robustness and reliability of machine learning models. Current work emphasizes methods for detecting and handling OOD data, including techniques that draw on graph theory, contrastive learning, and diffusion models, as well as adapting existing models through reweighting and fine-tuning. By mitigating the risks posed by unexpected or unseen data, this work supports safer and more dependable AI systems in applications ranging from autonomous vehicles to medical image analysis. Handling imbalanced datasets and complex real-world distribution shifts effectively remains a key challenge.
Papers
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations
Lifan Yuan, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Fangyuan Zou, Xingyi Cheng, Heng Ji, Zhiyuan Liu, Maosong Sun
Exploring Simple, High Quality Out-of-Distribution Detection with L2 Normalization
Jarrod Haas, William Yolland, Bernhard Rabus