Quantized Neural Network

Quantized neural networks (QNNs) aim to reduce the computational cost and memory footprint of deep learning models by representing weights and activations with low-precision integers rather than 32-bit floating-point numbers. Current research focuses on improving the accuracy of QNNs through techniques such as quantization-aware training, exploring different quantization schemes (e.g., mixed-precision and stochastic quantization), and developing efficient algorithms for training and verification. This field is significant because QNNs enable the deployment of deep learning on resource-constrained devices, impacting applications ranging from mobile and edge computing to embedded systems and Internet of Things (IoT) devices.
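
To illustrate the basic idea, the sketch below shows uniform affine quantization of a weight tensor to 8-bit signed integers and the corresponding dequantization, using NumPy. It is a minimal illustration of the general technique, not any specific paper's method; the function names and the per-tensor min/max calibration are illustrative assumptions.

```python
import numpy as np

def quantize_affine(x, num_bits=8):
    """Uniformly quantize a float tensor to signed integers (illustrative sketch)."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    # The scale maps the observed float range onto the integer grid.
    scale = (x.max() - x.min()) / (qmax - qmin)
    scale = max(scale, 1e-8)  # guard against a constant tensor
    # The zero point is the integer that represents the float value 0.
    zero_point = int(np.clip(round(qmin - x.min() / scale), qmin, qmax))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    """Map integers back to approximate float values."""
    return scale * (q.astype(np.float32) - zero_point)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.5, size=(4, 4)).astype(np.float32)  # stand-in layer weights
    q, scale, zp = quantize_affine(w)
    w_hat = dequantize_affine(q, scale, zp)
    print("max abs quantization error:", np.abs(w - w_hat).max())
```

Quantization-aware training typically inserts this quantize/dequantize round trip ("fake quantization") into the forward pass so the network learns weights that remain accurate after rounding, while inference frameworks then execute the arithmetic directly on the integer values.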

Papers