Real World Noisy Datasets
Real-world datasets are frequently contaminated with noisy labels, which degrade the performance of machine learning models. Current research focuses on robust training methods that mitigate the impact of this noise, including sample selection (identifying and then removing or correcting noisy samples), noise-robust loss functions, and the integration of external knowledge sources (e.g., large language models). These advances matter for the reliability and generalizability of models trained on real-world data in fields ranging from medical image analysis to autonomous driving, where perfectly labeled data is often unavailable or prohibitively expensive to obtain.
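As a rough illustration of two of the techniques named above, the sketch below shows a small-loss sample selection heuristic and a bounded, noise-robust loss (generalized cross-entropy) in NumPy. It is not taken from any specific paper listed on this page. The function names, the toy probabilities, and the parameters `keep_ratio` and `q` are assumptions chosen for the example.

```python
import numpy as np


def cross_entropy(probs, labels):
    """Per-sample cross-entropy given predicted class probabilities."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(np.clip(picked, 1e-12, 1.0))


def generalized_cross_entropy(probs, labels, q=0.7):
    """Per-sample loss (1 - p_y^q) / q, which is bounded above by 1/q.

    Because the loss is bounded, a confidently mislabeled sample cannot
    dominate the objective the way it can under plain cross-entropy.
    The value q=0.7 is an illustrative choice, not a recommendation.
    """
    picked = probs[np.arange(len(labels)), labels]
    return (1.0 - np.power(picked, q)) / q


def select_small_loss(losses, keep_ratio=0.5):
    """Return the (sorted) indices of the keep_ratio fraction of samples
    with the smallest loss, treating high-loss samples as likely noisy."""
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    return np.sort(np.argsort(losses)[:n_keep])


# Toy batch (hypothetical values): the third label disagrees with a
# confident prediction, which is the pattern a noisy label often produces.
probs = np.array([
    [0.90, 0.10],
    [0.20, 0.80],
    [0.95, 0.05],
    [0.30, 0.70],
])
labels = np.array([0, 1, 1, 1])

ce = cross_entropy(probs, labels)
gce = generalized_cross_entropy(probs, labels)
kept = select_small_loss(ce, keep_ratio=0.75)

print("cross-entropy:", np.round(ce, 3))
print("generalized CE:", np.round(gce, 3))
print("kept indices:", kept.tolist())
```

In this toy batch the suspicious third sample has the largest loss, so the selection step drops it, and its generalized cross-entropy value stays below the 1/q bound. In practice the keep ratio would usually be tied to an estimate of the noise rate, and the selection would be applied per mini-batch during training rather than once on fixed predictions.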
Papers
July 5, 2022
June 24, 2022
March 29, 2022
March 26, 2022
March 15, 2022
February 17, 2022
January 26, 2022
December 5, 2021
November 29, 2021