Stochastic Gradient

Stochastic gradient methods are fundamental algorithms for optimizing objective functions, particularly in large-scale machine learning: they seek good model parameters by iteratively updating them with noisy, inexpensive gradient estimates computed from small subsets of the data. Current research focuses on improving the convergence rates and robustness of these methods, particularly for non-convex objectives and in distributed settings; active directions include adaptive algorithms such as Adam, samplers such as SGHMC (stochastic gradient Hamiltonian Monte Carlo), and variance-reduced techniques, as well as handling heavy-tailed gradient noise and unbounded smoothness. These advances matter for training complex models such as deep neural networks and for accelerating progress in applications including natural language processing, computer vision, and reinforcement learning.
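
As a rough illustration of the core update rule described above, the sketch below implements plain mini-batch SGD for a least-squares problem in NumPy. The step size, batch size, epoch count, and synthetic data are arbitrary choices for the example, not values taken from any particular paper.

```python
import numpy as np

def sgd_least_squares(X, y, lr=0.1, batch_size=32, epochs=50, seed=0):
    """Mini-batch SGD on the least-squares objective (1/2n) * ||Xw - y||^2.

    Illustrative sketch: hyperparameters are placeholder values.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)           # reshuffle the data each epoch
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            # Noisy gradient estimate computed from the current mini-batch.
            grad = Xb.T @ (Xb @ w - yb) / len(idx)
            w -= lr * grad                  # gradient step with a constant step size
    return w

if __name__ == "__main__":
    # Synthetic regression data for a quick sanity check.
    rng = np.random.default_rng(1)
    w_true = rng.normal(size=5)
    X = rng.normal(size=(1000, 5))
    y = X @ w_true + 0.1 * rng.normal(size=1000)
    w_hat = sgd_least_squares(X, y)
    print("estimation error:", np.linalg.norm(w_hat - w_true))
```

Variants discussed in the literature (Adam, variance reduction, distributed updates) modify how the gradient estimate is formed or how the step is applied, but share this same iterative structure.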

Papers