Label Imbalance

Label imbalance arises when some classes in a dataset are heavily under-represented relative to others. Models trained on such data tend to favor the majority classes, which leads to biased predictions and poor performance on minority classes. Current research on mitigating imbalance falls into three broad groups: data augmentation (e.g., mixup and synthetic data generation), loss function modifications (e.g., class-weighted losses and focal loss), and adaptive optimization methods that tailor the learning process to the characteristics of each class. Addressing label imbalance matters for the fairness, robustness, and generalizability of machine learning models, particularly in domains such as healthcare and e-commerce, where data scarcity and class skew are common.
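
To make the loss-function modifications concrete, here is a minimal pure-Python sketch of two of the ideas named above: a class-weighted cross-entropy and the focal loss. It is an illustration under stated assumptions, not the method of any particular paper listed here. The function names, the inverse-frequency weighting rule, the toy 90/10 label split, and the default gamma=2.0 are all choices made for this example.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting rule: n_samples / (n_classes * class_count).

    Rare classes receive weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])


def focal_loss(probs, label, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    p_t is the predicted probability of the true class. The (1 - p_t)**gamma
    factor shrinks the loss of well-classified samples; gamma=0 with alpha=1
    reduces to plain cross-entropy.
    """
    p_t = probs[label]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy dataset (hypothetical): 90 negatives, 10 positives.
labels = ["neg"] * 90 + ["pos"] * 10
weights = inverse_frequency_weights(labels)

# A prediction that leans toward the majority class.
probs = {"neg": 0.9, "pos": 0.1}

for true_label in ("neg", "pos"):
    wce = weighted_cross_entropy(probs, true_label, weights)
    fl = focal_loss(probs, true_label)
    print(f"true={true_label}: weighted CE={wce:.4f}, focal={fl:.4f}")
```

In this sketch, both losses penalize the misclassified minority sample far more than the confidently classified majority sample, which is the effect these modifications aim for. In practice, the weights and the gamma value are hyperparameters that would need tuning on the dataset at hand.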

Papers