Int8 Training

Int8 training aims to make training large neural networks faster and more efficient by representing weights, activations, and/or gradients as 8-bit integers (int8) instead of higher-precision floating-point formats like FP16 or FP32. Current research focuses on overcoming the instabilities inherent in low-precision training, particularly for large language models (LLMs) and vision-language models, often through techniques such as modified activation functions and optimized AdamW variants. These advances reduce training time and memory requirements, improving the scalability and accessibility of training massive models for applications such as natural language processing and computer vision.
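The core operation underlying these methods is mapping a floating-point tensor onto the int8 range [-127, 127] with an accompanying scale factor, then dequantizing back when higher precision is needed. The sketch below illustrates this with symmetric per-tensor quantization in PyTorch; the function names and the per-tensor scaling choice are illustrative assumptions rather than the scheme of any particular paper, which may instead use per-channel or per-block scales.

```python
import torch

def quantize_int8(x: torch.Tensor):
    # Symmetric per-tensor quantization: map the largest magnitude to 127.
    # (Illustrative sketch; real int8-training schemes often use finer-grained scales.)
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(x / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize_int8(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Recover a floating-point approximation of the original tensor.
    return q.to(torch.float32) * scale

# Example: quantize a weight matrix and inspect the round-trip error.
w = torch.randn(4, 4)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
print((w - w_hat).abs().max())  # worst-case quantization error
```

In practice the int8 tensors feed integer matrix-multiply kernels, and the stability techniques mentioned above (activation modifications, optimizer adjustments) exist to keep this quantization error from derailing training.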

Papers