Biased Features
Biased features in machine learning are attributes that are spuriously correlated with training labels, leading models to rely on irrelevant information (e.g., race, gender) rather than task-relevant signal when making predictions. Current research focuses on identifying and mitigating these biases through techniques such as adversarial training, feature orthogonalization, and dataset refinement, applied across diverse model architectures including CNNs and LLMs. Addressing biased features is crucial for ensuring fairness, improving model generalization, and building trustworthy AI systems in applications ranging from image recognition to natural language processing.
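As a minimal sketch of one such mitigation technique, feature orthogonalization can be illustrated by projecting learned representations onto the subspace orthogonal to a known bias direction (the direction itself, and the toy data below, are hypothetical illustrations, not any specific paper's method):

```python
import numpy as np

def orthogonalize_features(features, bias_direction):
    """Remove the component of each feature vector along a known
    bias direction (e.g., one correlated with a protected attribute).

    features: (n_samples, d) array of representations.
    bias_direction: (d,) vector estimated to encode the biased attribute.
    """
    b = bias_direction / np.linalg.norm(bias_direction)
    # Subtract each row's projection onto b, leaving only the
    # component orthogonal to the bias direction.
    return features - np.outer(features @ b, b)

# Toy example: 4 samples, 3-dim features, bias along the first axis.
X = np.array([[2.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [3.0, 2.0, 2.0],
              [0.5, 1.0, 1.0]])
b = np.array([1.0, 0.0, 0.0])

X_debiased = orthogonalize_features(X, b)
# After orthogonalization, features carry no component along b.
print(X_debiased @ b)
```

A downstream classifier trained on `X_debiased` can no longer exploit the component of the representation aligned with `b`, which is the core idea behind orthogonalization-based debiasing; in practice the bias direction is typically estimated from data (e.g., via a linear probe for the protected attribute) rather than known in advance.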