Sparse Update

Sparse update methods in machine learning aim to improve training efficiency and resource utilization by updating only a subset of model parameters during training rather than all of them. Current research focuses on algorithms that strategically select which parameters to update, using techniques such as gradient magnitude analysis, structured sparsity, and randomized selection, often in the context of large language models (LLMs), vision transformers (ViTs), and federated learning. The approach offers significant benefits, including a reduced memory footprint, faster training, and lower communication costs, and it matters for applications ranging from on-device training to high-resolution image processing and large-scale model deployment.
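
To make the gradient-magnitude idea concrete, the sketch below zeroes all but the largest fraction of gradients (by absolute value) before the optimizer step, so only the selected parameters move. This is a minimal PyTorch illustration under stated assumptions, not any specific paper's method; the helper name `sparse_update_step`, the `keep_ratio` parameter, and the per-tensor top-k selection are choices made for this example.

```python
import torch

def sparse_update_step(model, loss, optimizer, keep_ratio=0.1):
    """One training step that updates only the fraction of parameters
    with the largest gradient magnitudes (hypothetical helper; the
    selection is done per tensor for simplicity)."""
    optimizer.zero_grad()
    loss.backward()
    for param in model.parameters():
        if param.grad is None:
            continue
        flat = param.grad.abs().flatten()
        k = max(1, int(keep_ratio * flat.numel()))
        # Threshold = the k-th largest gradient magnitude in this tensor.
        threshold = torch.topk(flat, k).values.min()
        mask = (param.grad.abs() >= threshold).to(param.grad.dtype)
        # Zero out low-magnitude gradients so the step leaves them alone.
        param.grad.mul_(mask)
    optimizer.step()

# Usage: replaces the usual loss.backward(); optimizer.step() pair.
# model, criterion, x, y are assumed to be defined elsewhere.
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# loss = criterion(model(x), y)
# sparse_update_step(model, loss, optimizer, keep_ratio=0.05)
```

One caveat on this design: zeroing a gradient only guarantees the parameter stays fixed under plain SGD. Stateful optimizers such as Adam still apply updates from their moment estimates, so practical sparse-update implementations typically mask inside the optimizer state or maintain explicit parameter masks instead.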

Papers