Paper ID: 2212.00433

Regularization Trade-offs with Fake Features

Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Recent successes of massively overparameterized models have inspired a new line of work investigating the underlying conditions that enable overparameterized models to generalize well. This paper considers a framework where the possibly overparametrized model includes fake features, i.e., features that are present in the model but not in the data. We present a non-asymptotic high-probability bound on the generalization error of the ridge regression problem under the model misspecification of having fake features. Our highprobability results provide insights into the interplay between the implicit regularization provided by the fake features and the explicit regularization provided by the ridge parameter. Numerical results illustrate the trade-off between the number of fake features and how the optimal ridge parameter may heavily depend on the number of fake features.

Submitted: Dec 1, 2022