Biased Data
Biased data, whether it stems from skewed sampling or from societal prejudices encoded in the data, degrades the fairness and accuracy of machine learning models in applications ranging from healthcare recommendations to hiring. Current research focuses on identifying and mitigating these biases through data augmentation, debiasing algorithms (including variational autoencoders and self-supervised adversarial training), and fairness-aware model training; much of this work builds on transformer-based models and convolutional neural networks. Addressing data bias is crucial for ensuring equitable outcomes in AI systems and for improving the reliability of scientific findings derived from data-driven analyses.
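As a concrete illustration of the "identifying" step, the sketch below compares how often a model predicts the positive class for each group and reports the largest difference, a quantity often called the demographic parity gap. The choice of metric, the function names, and the toy data are all assumptions made for illustration; none of them is taken from the papers listed here.

```python
from collections import defaultdict


def positive_rate_by_group(groups, predictions):
    """Return the fraction of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(groups, predictions):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = positive_rate_by_group(groups, predictions)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: group membership and binary model predictions.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = positive_rate_by_group(groups, predictions)
gap = demographic_parity_gap(groups, predictions)
print(rates, gap)
```

On this toy data the positive rates are 0.75 for group "a" and 0.25 for group "b", giving a gap of 0.5. A gap near zero does not by itself show that a model is fair; it is one of several possible diagnostics, and which one is appropriate depends on the application.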
Papers
Toward More Generalized Malicious URL Detection Models
YunDa Tsai, Cayon Liow, Yin Sheng Siang, Shou-De Lin
Photometric Redshift Estimation with Convolutional Neural Networks and Galaxy Images: A Case Study of Resolving Biases in Data-Driven Methods
Q. Lin, D. Fouchez, J. Pasquet, M. Treyer, R. Ait Ouahmed, S. Arnouts, O. Ilbert