Mixed Precision Training
Mixed precision training speeds up deep learning training by using lower-precision numerical formats (e.g., 16-bit or even 8-bit) alongside higher-precision formats (e.g., 32-bit), reducing memory usage and compute time. Current research focuses on developing efficient mixed-precision strategies for various architectures, including large language models (LLMs), convolutional neural networks (CNNs) for image classification, and physics-informed neural networks (PINNs) for scientific machine learning. This approach significantly impacts the field by accelerating training, lowering energy consumption, and enabling the training of larger, more complex models that were previously computationally infeasible.
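As a concrete illustration, the sketch below shows the common FP16/FP32 recipe using PyTorch's automatic mixed precision utilities (torch.cuda.amp): forward and backward passes run largely in 16-bit under autocast, while a gradient scaler guards against FP16 gradient underflow. The model, data, and hyperparameters are placeholders chosen only to make the example self-contained, not taken from any of the papers above.

```python
import torch
import torch.nn as nn

# Placeholder model, optimizer, and loss; any FP32 model would work the same way.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# GradScaler multiplies the loss by a dynamic factor so that small FP16
# gradients do not underflow to zero during the backward pass.
scaler = torch.cuda.amp.GradScaler()

for step in range(100):
    # Synthetic batch standing in for a real data loader.
    inputs = torch.randn(32, 512, device="cuda")
    targets = torch.randint(0, 10, (32,), device="cuda")

    optimizer.zero_grad(set_to_none=True)

    # autocast runs matmuls and convolutions in FP16 while keeping
    # numerically sensitive ops (e.g., reductions, softmax) in FP32.
    with torch.cuda.amp.autocast():
        logits = model(inputs)
        loss = loss_fn(logits, targets)

    # Backpropagate the scaled loss; scaler.step() unscales gradients,
    # skips the update if any gradient is inf/NaN, and update() adjusts
    # the scale factor for the next iteration.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

Keeping the master weights and optimizer state in FP32 while computing in FP16 is what preserves accuracy here; frameworks other than PyTorch (e.g., JAX or TensorFlow) expose the same pattern under different APIs.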