Compressed Stochastic Gradient Descent
Compressed Stochastic Gradient Descent (SGD) aims to accelerate distributed machine learning by reducing the communication overhead of transmitting large gradients during training. Current research focuses on adaptive compression techniques, such as quantization, sparsification, and low-rank approximation, often combined with error feedback mechanisms and adaptive step sizes to preserve accuracy despite the information lost to compression. These advances are crucial for scaling the training of large models on distributed systems, improving efficiency and reducing the associated time and energy costs.
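As a concrete illustration of the idea, the sketch below combines one common compressor (top-k sparsification) with an error feedback buffer in a plain SGD loop on a toy least-squares problem. It is a minimal, generic example, not the method of any particular paper; the names (`top_k`, the residual buffer `e`, the step size `lr`) and the problem setup are illustrative assumptions.

```python
# Minimal sketch: compressed SGD with top-k sparsification and error feedback.
# Illustrative only; problem setup and hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: minimize (1/2n) * ||A x - b||^2
n, d = 200, 50
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

def grad(x, idx):
    """Stochastic gradient on the mini-batch of rows `idx`."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    keep = np.argpartition(np.abs(v), -k)[-k:]
    out[keep] = v[keep]
    return out

x = np.zeros(d)
e = np.zeros(d)            # error-feedback buffer: what compression dropped so far
lr, k, batch = 0.1, 5, 32

for t in range(500):
    idx = rng.choice(n, size=batch, replace=False)
    g = grad(x, idx)
    corrected = lr * g + e      # add back the residual from previous rounds
    c = top_k(corrected, k)     # the sparse update a worker would transmit
    e = corrected - c           # store the new compression residual
    x -= c

print("final loss:", 0.5 * np.mean((A @ x - b) ** 2))
```

The key point is that the residual `e` re-injects discarded gradient mass on later iterations, which is what lets aggressive compressors (here, keeping only 5 of 50 coordinates per step) still drive the loss down.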