Online Gradient Descent

Online gradient descent (OGD) is an iterative optimization method that updates model parameters as data arrive sequentially, taking a gradient step on each instantaneous loss with the goal of minimizing cumulative loss (regret) over time. Current research focuses on improving OGD's efficiency and robustness across settings: handling outliers, adapting to non-convex losses, and managing memory constraints for large-scale models such as LLMs, often via techniques such as subspace descent and variance reduction. These advances matter for applications that must learn in real time from streaming data, including online control systems, recommendation systems, and large language model training.
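To make the core update concrete, below is a minimal sketch of vanilla OGD, w_{t+1} = w_t − η_t ∇ℓ_t(w_t), applied to a squared loss on a stream of (x, y) pairs with the standard 1/√t step-size schedule for convex losses. The function and variable names are illustrative, not drawn from any specific paper or library.

```python
import numpy as np

def ogd_squared_loss(stream, dim, eta0=0.5):
    """Run online gradient descent on a stream of (x, y) pairs.

    At each step t, predicts with the current weights, then takes a
    gradient step on the instantaneous squared loss 0.5 * (w.x - y)^2
    with step size eta0 / sqrt(t).
    """
    w = np.zeros(dim)
    for t, (x, y) in enumerate(stream, start=1):
        eta = eta0 / np.sqrt(t)      # decaying step size, standard for convex OGD
        grad = (w @ x - y) * x       # gradient of 0.5 * (w.x - y)^2 w.r.t. w
        w -= eta * grad              # OGD update: w_{t+1} = w_t - eta_t * grad
    return w

# Illustrative usage on a synthetic linear-regression stream.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])
stream = (((x := rng.normal(size=3)), w_true @ x + 0.01 * rng.normal())
          for _ in range(1000))
w_hat = ogd_squared_loss(stream, dim=3)
print(w_hat)  # should approach w_true as the stream is consumed
```

Because each update touches only the current example, memory and per-step compute are independent of the stream length, which is what makes variants of this scheme attractive for the streaming and large-model settings described above.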

Papers