Memory-Efficient Training

Memory-efficient training aims to reduce the substantial memory demands of training large neural networks, particularly large language models (LLMs) and other transformers, while maintaining or improving performance. Current research explores a range of techniques, including smarter mini-batch selection, low-rank adaptation methods such as LoRA, reversible architectures, and novel quantization and pruning schemes that compress activations and gradients. These advances help democratize access to powerful AI models by enabling training on more readily available hardware and by reducing the environmental cost of computationally intensive training.
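To make the memory savings of low-rank methods concrete, here is a minimal sketch of a LoRA-style adapter wrapped around a frozen linear layer, assuming a PyTorch setup. The class name `LoRALinear` and the hyperparameters (`rank`, `alpha`) are illustrative choices for this sketch, not taken from any particular paper or library; the point is that gradients and optimizer state are only kept for the small low-rank factors rather than the full weight matrix.

```python
# Illustrative sketch of a LoRA-style low-rank adapter (not a specific
# library's implementation). The pretrained weight is frozen; only the
# small factors A and B are trained, shrinking gradient and optimizer
# memory roughly in proportion to the rank.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen nn.Linear plus a trainable low-rank update B @ A."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the pretrained weight so it receives no gradients
        # and no optimizer state.
        for p in self.base.parameters():
            p.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(1024, 1024), rank=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    # Roughly 16K trainable parameters versus ~1.05M in the full layer.
    print(f"trainable params: {trainable} / {total}")
```

In this toy example only about 1.5% of the layer's parameters require gradients, which is the mechanism by which LoRA-style methods cut training memory; other approaches named above (reversible architectures, activation/gradient quantization and pruning) attack the activation memory instead.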

Papers