Selection Bias
Selection bias, the systematic distortion of a dataset caused by non-random sampling, poses a significant challenge across fields ranging from healthcare and machine learning to economics and natural language processing. Current research focuses on mitigating this bias through techniques such as propensity score matching, inverse probability weighting, and novel loss functions that account for biased data distributions, applied across model architectures including graph neural networks and large language models. Addressing selection bias is crucial for the reliability and generalizability of research findings and for the fairness and accuracy of machine learning applications.
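To make the idea concrete, here is a minimal sketch of one technique named above, inverse probability weighting (IPW). The scenario is hypothetical: units with larger outcomes are more likely to enter the sample, so the naive sample mean is biased upward; reweighting each observed unit by 1 / P(selected) recovers the population mean. The selection function and all numbers are illustrative assumptions, not drawn from the papers below.

```python
import random

random.seed(0)

# Population: outcomes 0..9, each equally common.
population = [float(y) for y in range(10)]
true_mean = sum(population) / len(population)  # 4.5

# Selection probability depends on the outcome itself -- this is the
# source of the selection bias (larger y -> more likely to be observed).
def p_select(y):
    return 0.1 + 0.08 * y

# Draw a biased sample: each candidate unit is kept with probability
# p_select(y), so high-outcome units are over-represented.
sample = []
for _ in range(20000):
    y = random.choice(population)
    if random.random() < p_select(y):
        sample.append(y)

# Naive estimate: biased upward because of the sampling mechanism.
naive_mean = sum(sample) / len(sample)

# IPW estimate: weight each observation by the inverse of its
# selection probability, then normalize by the total weight.
weights = [1.0 / p_select(y) for y in sample]
ipw_mean = sum(w * y for w, y in zip(weights, sample)) / sum(weights)

print(f"true mean:  {true_mean:.2f}")
print(f"naive mean: {naive_mean:.2f}")  # noticeably above 4.5
print(f"IPW mean:   {ipw_mean:.2f}")    # close to 4.5
```

The same inverse-weighting idea underlies the reweighted loss functions mentioned above: each training example's loss is scaled by the inverse of its estimated selection probability, so the objective approximates an average over the unbiased population rather than the biased sample.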
Papers
Large Language Models Are Not Robust Multiple Choice Selectors
Chujie Zheng, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang
A Causal Perspective on Loan Pricing: Investigating the Impacts of Selection Bias on Identifying Bid-Response Functions
Christopher Bockel-Rickermann, Sam Verboven, Tim Verdonck, Wouter Verbeke