Mixed Precision
Mixed-precision computing in deep neural networks aims to improve efficiency and reduce resource consumption by assigning different numerical precisions (e.g., 16-bit, 8-bit, or even lower) to different parts of the network. Current research focuses on optimizing how these precisions are allocated across diverse architectures, including convolutional neural networks, transformers, and neural operators, often using techniques such as neural architecture search and gradient-based methods. This approach makes it feasible to deploy deep learning models on resource-constrained devices like microcontrollers and embedded systems, while also accelerating training and inference on more powerful hardware.
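To make this concrete, below is a minimal sketch of mixed-precision training using PyTorch's automatic mixed precision (AMP) API: autocast runs compute-heavy operations in 16-bit while keeping precision-sensitive ones in 32-bit, and a gradient scaler prevents small fp16 gradients from underflowing. The model, data, and hyperparameters are hypothetical stand-ins, not drawn from any paper in this collection.

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Toy model and random data, stand-ins for a real network and DataLoader.
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# GradScaler multiplies the loss by a large factor so that small fp16
# gradients do not underflow, then unscales them before the optimizer step.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

for step in range(10):
    inputs = torch.randn(32, 512, device=device)
    targets = torch.randint(0, 10, (32,), device=device)
    optimizer.zero_grad()

    # autocast runs matmul-heavy ops in fp16 while keeping
    # precision-sensitive ops (e.g., reductions, softmax) in fp32.
    with torch.cuda.amp.autocast(enabled=(device == "cuda")):
        loss = loss_fn(model(inputs), targets)

    scaler.scale(loss).backward()  # backprop through the scaled loss
    scaler.step(optimizer)         # unscales gradients; skips step on inf/nan
    scaler.update()                # adapts the loss scale for the next step

This per-operation precision assignment is the simplest, fixed policy; the research surveyed above goes further by searching over per-layer bit-widths (e.g., mixing 8-bit and 4-bit layers) rather than using a single global fp16/fp32 split.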