Sparse Gradient
Sparse gradient methods aim to improve the efficiency and scalability of training large machine learning models by concentrating computation on the most important gradient components. Current research focuses on algorithms that estimate and exploit sparse gradients in architectures such as Mixture-of-Experts (MoE) models and spiking neural networks (SNNs), and on making distributed training more efficient through techniques like compressed sensing and sparsity-aware all-reduce operations. Exploiting sparsity can substantially reduce computational cost, cut communication overhead in distributed settings, and improve the privacy and robustness of machine learning models.
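To make the idea concrete, below is a minimal sketch of one common sparse gradient technique: top-k gradient sparsification with local error feedback, where only the largest-magnitude gradient entries are transmitted and the dropped remainder is re-injected at the next step. This is an illustrative example in PyTorch, not the method of any particular paper listed here; the names `sparsify_topk` and `TopKSGD` are hypothetical.

```python
import torch

def sparsify_topk(grad: torch.Tensor, k_fraction: float = 0.01):
    """Keep only the largest-magnitude k-fraction of gradient entries.

    Returns (indices, values) of the retained entries plus the dense
    residual that was dropped, so the caller can apply error feedback.
    """
    flat = grad.flatten()
    k = max(1, int(k_fraction * flat.numel()))
    # Select the k entries with the largest absolute value.
    _, idx = torch.topk(flat.abs(), k)
    values = flat[idx]
    # Residual = everything not transmitted; kept locally and added
    # back into the next step's gradient (error feedback).
    residual = flat.clone()
    residual[idx] = 0.0
    return idx, values, residual.view_as(grad)

class TopKSGD:
    """Toy SGD loop that applies only the sparsified gradient."""

    def __init__(self, params, lr=0.1, k_fraction=0.01):
        self.params = list(params)
        self.lr = lr
        self.k_fraction = k_fraction
        self.errors = [torch.zeros_like(p) for p in self.params]

    @torch.no_grad()
    def step(self):
        for p, err in zip(self.params, self.errors):
            if p.grad is None:
                continue
            # Error feedback: re-inject previously dropped components.
            corrected = p.grad + err
            idx, vals, residual = sparsify_topk(corrected, self.k_fraction)
            err.copy_(residual)
            # In a distributed setting, (idx, vals) is what would be
            # communicated (e.g. all-gathered); here it is applied locally.
            sparse_update = torch.zeros_like(p).flatten()
            sparse_update[idx] = vals
            p.add_(sparse_update.view_as(p), alpha=-self.lr)
```

In a real distributed setup, the benefit comes from exchanging only the (index, value) pairs instead of the dense gradient, while the error-feedback buffer prevents the dropped components from being permanently lost.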