Class Imbalance
Class imbalance, the uneven distribution of classes in a dataset, degrades machine learning models by biasing them toward the majority class, so that minority classes are often predicted poorly. Current research mitigates it through three families of techniques: data resampling (oversampling minority classes or undersampling majority classes), cost-sensitive learning (assigning different misclassification costs to different classes), and algorithmic modifications (e.g., adapting loss functions, employing novel regularization methods within models such as gradient-boosted decision trees (GBDTs), and using contrastive learning). Addressing class imbalance is crucial for improving the fairness, robustness, and accuracy of machine learning models across diverse applications, from medical diagnosis and financial risk assessment to environmental monitoring and traffic sign recognition.
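As a rough illustration of two of the techniques named above, the following sketch computes inverse-frequency class weights (one simple form of cost-sensitive learning) and performs random oversampling of the minority class. It is not drawn from any of the listed papers. The function names `class_weights` and `random_oversample` and the 90/10 toy dataset are assumptions made for illustration, and the weighting formula is the common "balanced" heuristic, n_samples / (n_classes * class_count).

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Sample with replacement to fill the gap to the majority-class size.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


# Hypothetical toy data: 90 majority (class 0) and 10 minority (class 1) samples.
labels = [0] * 90 + [1] * 10
samples = list(range(100))

weights = class_weights(labels)            # minority class gets the larger weight
bal_x, bal_y = random_oversample(samples, labels)
print(weights)
print(Counter(bal_y))
```

In this toy case the minority class receives a weight of 5.0 against roughly 0.56 for the majority class, and oversampling yields 90 samples per class. Weights of this kind would typically be passed to a loss function or to a classifier's class-weight option, whereas oversampling changes the training set itself. Duplicating samples can encourage overfitting to the minority class, which is one reason synthetic and algorithm-level alternatives are studied.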
Papers
Powering Finetuning in Few-Shot Learning: Domain-Agnostic Bias Reduction with Selected Sampling
Ran Tao, Han Zhang, Yutong Zheng, Marios Savvides
A survey on learning from imbalanced data streams: taxonomy, challenges, empirical study, and reproducible experimental framework
Gabriel Aguiar, Bartosz Krawczyk, Alberto Cano