Mixed Precision
Mixed-precision computing in deep neural networks improves efficiency and reduces resource consumption by assigning different numerical precisions (e.g., 16-bit, 8-bit, or even lower) to different parts of the network. Current research focuses on optimizing how these precisions are allocated, often via neural architecture search or gradient-based methods, across diverse architectures including convolutional neural networks, transformers, and neural operators. This approach makes it practical to deploy deep learning models on resource-constrained devices such as microcontrollers and embedded systems, while also accelerating training and inference on more powerful hardware.
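As a concrete illustration, the sketch below shows one widespread instance of this idea: automatic mixed-precision training with PyTorch's torch.amp, where selected operations run in float16 under autocast while a gradient scaler compensates for float16's reduced dynamic range. The toy model, synthetic data, and hyperparameters are illustrative placeholders (not drawn from any particular paper), and the sketch assumes a CUDA device.

```python
import torch
import torch.nn as nn

# Placeholder model, optimizer, and loss; substitute your own in practice.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# GradScaler multiplies the loss before backward so small gradients
# don't underflow to zero in float16, then unscales before the step.
scaler = torch.cuda.amp.GradScaler()

for step in range(100):
    x = torch.randn(32, 128, device="cuda")            # synthetic inputs
    y = torch.randint(0, 10, (32,), device="cuda")     # synthetic labels

    optimizer.zero_grad(set_to_none=True)

    # Inside autocast, precision-safe ops (e.g., matmuls) run in float16;
    # numerically sensitive ops (e.g., reductions) stay in float32.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = loss_fn(model(x), y)

    scaler.scale(loss).backward()   # backward pass on the scaled loss
    scaler.step(optimizer)          # unscales gradients, skips step on inf/nan
    scaler.update()                 # adapts the scale factor for the next step
```

The per-layer bit-width allocation studied in much of the research above follows the same principle, but instead of a fixed float16/float32 split it searches (e.g., via NAS or gradient-based relaxation) for which layers can tolerate 8-bit or lower precision.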