Bias Attribute

Bias attributes are features that correlate spuriously with labels in a dataset. Models that rely on them lose fairness and generalizability, particularly in image classification and natural language understanding. Current research focuses on identifying and mitigating these biases, using techniques such as adversarial filtering, continuous learning, and attention-based information bottlenecks to disentangle intrinsic features from bias-related ones. The goal is more robust and equitable AI systems: better model performance on underrepresented groups and greater trust in AI applications across domains.
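To make the notion of a spurious correlation concrete, the following is a minimal sketch on toy data, not a method from any of the papers listed here. It assumes a hypothetical binary label and a binary bias attribute that agree on 90% of samples, plus a hypothetical "shortcut" predictor that simply copies the bias attribute. The names `group_accuracy`, `labels`, `biases`, and `preds` are illustrative only.

```python
from collections import defaultdict


def group_accuracy(labels, biases, preds):
    """Return accuracy for each (label, bias attribute) group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, b, p in zip(labels, biases, preds):
        total[(y, b)] += 1
        correct[(y, b)] += int(p == y)
    return {group: correct[group] / total[group] for group in total}


# Toy dataset (assumed): 90 bias-aligned samples where bias == label,
# followed by 10 bias-conflicting samples where bias != label.
labels = [0] * 45 + [1] * 45 + [0] * 5 + [1] * 5
biases = [0] * 45 + [1] * 45 + [1] * 5 + [0] * 5

# Hypothetical shortcut predictor: ignores the intrinsic features
# and outputs the bias attribute as its prediction.
preds = list(biases)

overall = sum(p == y for p, y in zip(preds, labels)) / len(labels)
per_group = group_accuracy(labels, biases, preds)
worst_group = min(per_group.values())

print(f"overall accuracy: {overall:.2f}")
print(f"worst-group accuracy: {worst_group:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"(label={group[0]}, bias={group[1]}): {acc:.2f}")
```

In this toy setting the shortcut looks accurate on average but fails on every bias-conflicting sample, which is why per-group or worst-group accuracy is often reported alongside the overall figure when bias is a concern.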

Papers