Paper ID: 2501.15941 • Published Jan 27, 2025

SAPPHIRE: Preconditioned Stochastic Variance Reduction for Faster Large-Scale Statistical Learning

Jingruo Sun, Zachary Frangella, Madeleine Udell
Regularized empirical risk minimization (rERM) has become important in data-intensive fields such as genomics and advertising, with stochastic gradient methods typically used to solve the largest problems. However, ill-conditioned objectives and non-smooth regularizers undermine the performance of traditional stochastic gradient methods, leading to slow convergence and significant computational costs. To address these challenges, we propose the \texttt{SAPPHIRE} (\textbf{S}ketching-based \textbf{A}pproximations for \textbf{P}roximal \textbf{P}reconditioning and \textbf{H}essian \textbf{I}nexactness with Variance-\textbf{RE}duced Gradients) algorithm, which integrates sketch-based preconditioning to tackle ill-conditioning and uses a scaled proximal mapping to handle the non-smooth regularizer. This stochastic variance-reduced algorithm achieves condition-number-free linear convergence to the optimum, delivering an efficient and scalable solution for large-scale, ill-conditioned composite convex machine learning problems. Extensive experiments on lasso and logistic regression demonstrate that \texttt{SAPPHIRE} often converges 20 times faster than common alternatives such as \texttt{Catalyst}, \texttt{SAGA}, and \texttt{SVRG}. This advantage persists even when the objective is non-convex or the preconditioner is updated only infrequently, highlighting the method's robustness and practical effectiveness.
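To make the ingredients named in the abstract concrete (a sketch-based preconditioner, variance-reduced gradients, and a scaled proximal step), the Python sketch below implements a preconditioned prox-SVRG loop for lasso. It is a minimal illustration, not the authors' SAPPHIRE implementation: the sketch-based preconditioner is reduced to a diagonal surrogate so the scaled $\ell_1$ prox stays in closed form, and all function names, parameters, and step-size choices are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding (prox of the l1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sketched_diag_preconditioner(A, sketch_size, rho, rng):
    """Diagonal preconditioner built from a row-subsampled sketch of A^T A / n.

    Illustrative simplification: SAPPHIRE uses a full sketch-based preconditioner
    with a scaled proximal mapping; a diagonal surrogate is used here only so the
    scaled l1 prox has a closed form.
    """
    m = min(sketch_size, A.shape[0])
    rows = rng.choice(A.shape[0], size=m, replace=False)
    diag_hess = (A[rows] ** 2).sum(axis=0) / m      # approximates diag(A^T A / n)
    return diag_hess + rho                           # regularize for invertibility

def preconditioned_prox_svrg_lasso(A, b, lam, epochs=20, sketch_size=256,
                                   rho=1e-3, seed=0):
    """Preconditioned prox-SVRG for lasso: min_x (1/2n)||Ax - b||^2 + lam ||x||_1."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    D = sketched_diag_preconditioner(A, sketch_size, rho, rng)
    # Conservative step size from the per-sample smoothness in the D-weighted norm.
    eta = 1.0 / (4.0 * np.max(np.sum(A * A / D, axis=1)))
    for _ in range(epochs):
        x_ref = x.copy()
        full_grad = A.T @ (A @ x_ref - b) / n        # snapshot (full) gradient
        for _ in range(n):
            i = rng.integers(n)
            a_i = A[i]
            # variance-reduced stochastic gradient
            g = a_i * (a_i @ x - b[i]) - a_i * (a_i @ x_ref - b[i]) + full_grad
            # preconditioned step, then the scaled (diagonal) proximal map
            x = soft_threshold(x - eta * g / D, eta * lam / D)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((2000, 100)) * np.linspace(1.0, 50.0, 100)  # ill-conditioned columns
    x_true = np.zeros(100)
    x_true[:5] = 3.0
    b = A @ x_true + 0.01 * rng.standard_normal(2000)
    x_hat = preconditioned_prox_svrg_lasso(A, b, lam=0.1)
    print("recovered support size:", int(np.sum(np.abs(x_hat) > 1e-3)))
```

Note that the preconditioner above is built once and then held fixed for all epochs, loosely mirroring the "infrequently updated preconditioner" regime the abstract says the method tolerates.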