Label Corruption

Label corruption, the presence of incorrect labels in training datasets, degrades both the performance and the generalization of machine learning models. Current research focuses on algorithms and model architectures that detect noisy labels and mitigate their impact. Techniques include dynamic distribution calibration, adversarial training, and post-training correction methods that leverage verified samples or cluster training losses. Addressing label corruption matters for the reliability and accuracy of models across applications, particularly where obtaining perfectly labeled data is expensive or impractical.
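As a rough illustration of the loss-clustering idea mentioned above, the sketch below splits per-sample training losses into a low-loss and a high-loss group and flags the high-loss group as possibly mislabeled. It is a simplified stand-in rather than any specific published method: it uses plain two-cluster k-means in one dimension, and the function name `flag_suspect_labels` and the example loss values are hypothetical. It assumes the losses were recorded from a partially trained model, where mislabeled samples tend to have higher loss.

```python
import numpy as np

def flag_suspect_labels(losses, n_iter=50):
    """Return a boolean mask marking samples in the high-loss cluster.

    Hypothetical helper: 1-D two-cluster k-means over per-sample losses.
    """
    losses = np.asarray(losses, dtype=float)
    # Initialize the two cluster centers at the smallest and largest loss.
    centers = np.array([losses.min(), losses.max()])
    for _ in range(n_iter):
        # Assign each sample to its nearest center (0 = low loss, 1 = high loss).
        assign = np.abs(losses[:, None] - centers[None, :]).argmin(axis=1)
        new_centers = centers.copy()
        for k in range(2):
            if np.any(assign == k):
                new_centers[k] = losses[assign == k].mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Samples in the high-loss cluster are candidates for label review.
    return assign == 1

# Illustrative values only: four small losses and two large ones.
example_losses = [0.1, 0.2, 0.15, 2.5, 0.05, 3.0]
suspect_mask = flag_suspect_labels(example_losses)
print(suspect_mask.tolist())
```

A flagged sample is only a candidate: hard but correctly labeled examples can also have high loss, so in practice the mask would feed a relabeling, reweighting, or manual verification step rather than outright removal.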

Papers