Paper ID: 2311.06960

Robust Regression over Averaged Uncertainty

Dimitris Bertsimas, Yu Ma

We propose a new formulation of robust regression by integrating all realizations of the uncertainty set and taking an averaged approach to obtain the optimal solution for the ordinary least squares regression problem. We show that this formulation recovers ridge regression exactly and establishes the missing link between robust optimization and the mean squared error approaches for existing regression problems. We further demonstrate that the condition of this equivalence relies on the geometric properties of the defined uncertainty set. We provide exact, closed-form, in some cases, analytical solutions to the equivalent regularization strength under uncertainty sets induced by $\ell_p$ norm, Schatten $p$-norm, and general polytopes. We then show in synthetic datasets with different levels of uncertainties, a consistent improvement of the averaged formulation over the existing worst-case formulation in out-of-sample performance. In real-world regression problems obtained from UCI datasets, similar improvements are seen in the out-of-sample datasets.

Submitted: Nov 12, 2023