Paper ID: 2308.03102

Learning-Rate-Free Learning: Dissecting D-Adaptation and Probabilistic Line Search

Max McGuinness

This paper explores two recent methods for learning rate optimisation in stochastic gradient descent: D-Adaptation (arXiv:2301.07733) and probabilistic line search (arXiv:1502.02846). These approaches aim to alleviate the burden of selecting an initial learning rate by incorporating distance metrics and Gaussian process posterior estimates, respectively. In this report, I provide an intuitive overview of both methods, discuss their shared design goals, and outline the scope for merging the two algorithms.

Submitted: Aug 6, 2023

Topics

Stochastic Gradient Descent
Adaptation Concern
Learning Rate
Optimal Rate
Learning Rate Free
Stochastic Line Search
