Clipped Stochastic Gradient Descent
Clipped stochastic gradient descent (SGD) caps the norm of each stochastic gradient at a fixed threshold before applying the update, a technique used in machine learning to improve the robustness and privacy of training algorithms, particularly when dealing with noisy or heavy-tailed gradient distributions. Current research focuses on analyzing its convergence properties in various settings, including non-convex optimization and differentially private training, often employing adaptive step-size methods such as AdaGrad and Adam, or exploring alternatives such as SignSGD. This work is significant because it addresses challenges in training large models, strengthens privacy guarantees, and improves the reliability of optimization algorithms across diverse applications, from natural language processing to computer vision.
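As a concrete illustration, the sketch below shows one common way to implement clipped SGD in PyTorch: the global gradient norm is rescaled to at most a threshold c before each optimizer step. The toy model, data, learning rate, and clipping threshold are illustrative placeholders, not taken from the source.

```python
# Minimal sketch of clipped SGD in PyTorch (illustrative hyperparameters).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data and a small linear model.
X = torch.randn(256, 10)
y = X @ torch.randn(10, 1) + 0.1 * torch.randn(256, 1)
model = nn.Linear(10, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

clip_norm = 1.0  # threshold c: gradients with norm > c are rescaled to norm c

for step in range(100):
    idx = torch.randint(0, X.shape[0], (32,))   # sample a mini-batch
    loss = loss_fn(model(X[idx]), y[idx])
    opt.zero_grad()
    loss.backward()
    # Clip the global gradient norm before the SGD update.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=clip_norm)
    opt.step()
```

In differentially private training, the same idea is typically applied per example rather than per batch, with Gaussian noise added to the sum of the clipped gradients before the update.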