Quantized Neural Network
Quantized neural networks (QNNs) aim to reduce the computational cost and memory footprint of deep learning models by representing weights and activations using lower-precision integer arithmetic, rather than 32-bit floating-point numbers. Current research focuses on improving the accuracy of QNNs through techniques like quantization-aware training, exploring different quantization schemes (e.g., mixed-precision, stochastic quantization), and developing efficient algorithms for training and verification. This field is significant because QNNs enable the deployment of deep learning on resource-constrained devices, impacting applications ranging from mobile and edge computing to embedded systems and Internet of Things (IoT) devices.
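To make the core idea concrete, the sketch below shows one common building block: symmetric per-tensor quantization of a float weight tensor to int8. The function names and the NumPy-based workflow are illustrative assumptions, not a specific library's API; real QNN toolchains add per-channel scales, zero-points, and quantization-aware training on top of this.

```python
import numpy as np

def quantize_symmetric(x, num_bits=8):
    """Illustrative symmetric per-tensor quantization (hypothetical helper).

    Maps float values onto signed integers in [-(2^(b-1)-1), 2^(b-1)-1]
    using a single scale derived from the maximum absolute value, so the
    rounding error of any element is at most scale / 2.
    """
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax          # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from integers and a scale."""
    return q.astype(np.float32) * scale

# Example: a small float32 weight vector round-tripped through int8.
weights = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, s = quantize_symmetric(weights)
recovered = dequantize(q, s)
```

The integer tensor `q` is what a resource-constrained device would store and compute with; `recovered` shows the approximation error the network must tolerate, which is what quantization-aware training minimizes.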