Imbalanced Data
Imbalanced data, in which one class significantly outnumbers the others, is a major challenge for machine learning models because it hinders their ability to predict minority classes accurately. Current research mitigates the imbalance through data augmentation (e.g., with generative adversarial networks or variational autoencoders), cost-sensitive learning, and novel loss functions designed to be robust to skewed distributions. Various model architectures, including recurrent neural networks, Transformers, and ensemble methods, are also being adapted and evaluated for these scenarios. Addressing imbalanced data is crucial for improving the reliability and fairness of machine learning models across diverse applications, from medical image analysis and fraud detection to cybersecurity and industrial process monitoring.
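As a rough illustration of cost-sensitive learning, the sketch below weights each class inversely to its frequency and applies those weights in a cross-entropy loss. It is a minimal example under stated assumptions, not a method taken from any of the papers listed here: the function names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical, the 9:1 toy label set is invented for the example, and the weighting rule n_samples / (n_classes * class_count) is one common heuristic among several.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes get larger weights, so their errors cost more.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_cross_entropy(labels, probs, weights):
    """Weighted mean negative log-likelihood.

    probs[i] maps each class to the predicted probability for sample i.
    The sum of per-sample weights is used as the normaliser.
    """
    total = sum(weights[y] * -math.log(p[y]) for y, p in zip(labels, probs))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Hypothetical toy data: nine majority-class samples and one minority-class sample.
labels = [0] * 9 + [1]
# A model that always leans towards the majority class.
probs = [{0: 0.9, 1: 0.1} for _ in labels]

weights = inverse_frequency_weights(labels)
uniform = {0: 1.0, 1: 1.0}

plain_loss = weighted_cross_entropy(labels, probs, uniform)
weighted_loss = weighted_cross_entropy(labels, probs, weights)
print(weights)
print(plain_loss, weighted_loss)
```

With these weights the single minority sample contributes as much to the loss as the nine majority samples combined, so a model that ignores the minority class is penalised far more heavily than under the unweighted loss.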
Papers
Systematic Review: Text Processing Algorithms in Machine Learning and Deep Learning for Mental Health Detection on Social Media
Yuchen Cao, Jianglai Dai, Zhongyan Wang, Yeyubei Zhang, Xiaorui Shen, Yunchong Liu, Yexin Tian
GReFEL: Geometry-Aware Reliable Facial Expression Learning under Bias and Imbalanced Data Distribution
Azmine Toushik Wasi, Taki Hasan Rafi, Raima Islam, Karlo Serbetar, Dong Kyu Chae
Classification Modeling with RNN-Based, Random Forest, and XGBoost for Imbalanced Data: A Case of Early Crash Detection in ASEAN-5 Stock Markets
Deri Siswara, Agus M. Soleh, Aji Hamim Wigena
GENIU: A Restricted Data Access Unlearning for Imbalanced Data
Chenhao Zhang, Shaofei Shen, Yawen Zhao, Weitong Tony Chen, Miao Xu