Asynchronous Stochastic Gradient Descent

Asynchronous Stochastic Gradient Descent (ASGD) is a distributed optimization technique that aims to accelerate machine learning training by allowing worker nodes to update a shared model independently and asynchronously, thereby avoiding the straggler and synchronization bottlenecks of synchronous approaches. Because workers do not wait for one another, their gradients may be computed on stale parameters, so current research focuses on improving ASGD's robustness to communication delays and data heterogeneity through techniques like delayed gradient aggregation, adaptive step sizes, and novel scheduling algorithms, often applied to large-scale models such as deep neural networks. These advancements are significant because they enable faster and more efficient training of complex models across diverse hardware and network conditions, impacting both the scalability of machine learning research and the deployment of real-world applications.
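To make the core idea concrete, below is a minimal Python sketch of ASGD on a single machine, with threads standing in for worker nodes and a shared parameter vector standing in for the model. The problem setup (least-squares regression), the worker function, and the staleness-dependent step size are illustrative assumptions, not a specific published algorithm; each worker computes a gradient on a possibly stale snapshot of the parameters and applies it without waiting for the others, and the step size shrinks with staleness to illustrate the "adaptive step sizes" idea.

```python
import threading
import numpy as np

# Synthetic least-squares problem (illustrative assumption, not from the text).
data_rng = np.random.default_rng(0)
d, n = 10, 5000
X = data_rng.normal(size=(n, d))
w_true = data_rng.normal(size=d)
y = X @ w_true + 0.01 * data_rng.normal(size=n)

w = np.zeros(d)           # shared model, updated in place by all workers
global_step = [0]         # shared update counter, used to measure staleness
lock = threading.Lock()   # guards the read-modify-write of the shared model


def worker(seed, num_steps=2000, batch_size=32, base_lr=0.05):
    rng = np.random.default_rng(seed)
    for _ in range(num_steps):
        idx = rng.integers(0, n, size=batch_size)
        # Snapshot the shared parameters; by the time this gradient is
        # applied, other workers may have moved the model, so it is stale.
        w_snapshot = w.copy()
        read_step = global_step[0]
        grad = X[idx].T @ (X[idx] @ w_snapshot - y[idx]) / batch_size
        with lock:
            # Staleness-aware step size: shrink the update if the model
            # advanced while this gradient was being computed.
            staleness = global_step[0] - read_step
            lr = base_lr / (1 + staleness)
            w[:] -= lr * grad
            global_step[0] += 1


threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("parameter error:", np.linalg.norm(w - w_true))
```

In a real distributed deployment the shared vector would live on a parameter server (or be held in sharded form across nodes) and the lock would be replaced by atomic or lock-free updates; the sketch only shows how asynchronous, delay-tolerant updates differ from a synchronous all-reduce step.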

Papers