Paper ID: 2409.14637

Not Only the Last-Layer Features for Spurious Correlations: All Layer Deep Feature Reweighting

Humza Wajid Hameed, Geraldin Nanfack, Eugene Belilovsky

Spurious correlations are a major source of errors for machine learning models, in particular when aiming for group-level fairness. It has been recently shown that a powerful approach to combat spurious correlations is to re-train the last layer on a balanced validation dataset, isolating robust features for the predictor. However, key attributes can sometimes be discarded by neural networks towards the last layer. In this work, we thus consider retraining a classifier on a set of features derived from all layers. We utilize a recently proposed feature selection strategy to select unbiased features from all the layers. We observe this approach gives significant improvements in worst-group accuracy on several standard benchmarks.

Submitted: Sep 23, 2024