Instance-Dependent Noise

Instance-dependent noise (IDN) refers to label errors in machine learning datasets whose probability depends on the features of each individual data point. This contrasts with the simpler class-conditional noise model, in which the chance of a mislabel depends only on the true class. Current research focuses on developing robust algorithms that can learn effectively from such noisy data, using techniques such as generative models to modify features, graphical models to estimate noise rates, and ensemble methods to make label corrections more reliable. Addressing IDN is important for improving the accuracy and generalization of machine learning models in real-world applications where noisy labels are prevalent, particularly image classification, text classification, and other domains with complex data.
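The difference between the two noise models can be made concrete with a small simulation. The following is a minimal sketch, not a method taken from any particular paper: it assumes binary labels, a hypothetical linear decision boundary (`w`, `b`), and an illustrative flip probability that decays with distance from that boundary. The function names, the per-class rates, and the `max_rate` parameter are all assumptions chosen for illustration.

```python
import numpy as np


def class_conditional_flip(y, class_rates, rng):
    """Flip each binary label with a rate that depends only on its true class."""
    flip_prob = class_rates[y]  # one rate per class, features are ignored
    flips = rng.random(y.shape[0]) < flip_prob
    return np.where(flips, 1 - y, y)


def instance_dependent_flip(X, y, w, b, max_rate, rng):
    """Flip each binary label with a probability that depends on its features.

    Assumed noise model (illustrative only): points close to the hypothetical
    boundary w.x + b = 0 are flipped with probability up to max_rate, and the
    probability decays exponentially with distance from the boundary.
    """
    margin = np.abs(X @ w + b)
    flip_prob = max_rate * np.exp(-margin)
    flips = rng.random(y.shape[0]) < flip_prob
    return np.where(flips, 1 - y, y), flip_prob


rng = np.random.default_rng(0)

# Synthetic 2-D data with clean labels from a hypothetical linear boundary.
X = rng.normal(size=(5000, 2))
w, b = np.array([1.0, -1.0]), 0.0
y = (X @ w + b > 0).astype(int)

y_ccn = class_conditional_flip(y, np.array([0.1, 0.3]), rng)
y_idn, flip_prob = instance_dependent_flip(X, y, w, b, 0.4, rng)

# Compare observed error rates near and far from the boundary.
margin = np.abs(X @ w + b)
near = margin < np.median(margin)

ccn_near = np.mean(y_ccn[near] != y[near])
ccn_far = np.mean(y_ccn[~near] != y[~near])
idn_near = np.mean(y_idn[near] != y[near])
idn_far = np.mean(y_idn[~near] != y[~near])

print(f"class-conditional: near={ccn_near:.3f}, far={ccn_far:.3f}")
print(f"instance-dependent: near={idn_near:.3f}, far={idn_far:.3f}")
```

Under these assumptions, the class-conditional error rate should be roughly the same near and far from the boundary, while the instance-dependent error rate should concentrate on the ambiguous points near it. That concentration is what makes IDN harder to model with a single per-class noise rate.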

Papers