Quantization-Aware Knowledge Distillation
Quantization-aware knowledge distillation (QKD) produces efficient low-bit deep learning models by transferring knowledge from a high-precision teacher model while the student is trained under quantization constraints. Current research focuses on recovering the accuracy lost to quantization, particularly for transformers and large language models, through techniques such as self-supervised learning, novel quantization schemes (e.g., hybrid quantization), and better-tuned distillation strategies. This line of work matters because it enables the deployment of complex deep learning models on resource-constrained devices, with applications ranging from image processing and natural language processing to autonomous driving and remote sensing. A minimal sketch of the core training loop follows.
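The sketch below is an illustrative, simplified example (not the method of any specific paper): it assumes PyTorch, uniform per-tensor fake quantization of weights with a straight-through estimator, and a standard distillation loss that mixes temperature-scaled KL divergence against the full-precision teacher with cross-entropy on the labels. Names such as `FakeQuantize`, `QuantLinear`, and `qkd_loss` are hypothetical helpers introduced here for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FakeQuantize(torch.autograd.Function):
    """Uniform per-tensor fake quantization with a straight-through estimator."""

    @staticmethod
    def forward(ctx, x, num_bits=8):
        qmin, qmax = 0, 2 ** num_bits - 1
        scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
        zero_point = qmin - torch.round(x.min() / scale)
        q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
        return (q - zero_point) * scale  # dequantized value, quantization error included

    @staticmethod
    def backward(ctx, grad_output):
        # STE: gradients pass through the non-differentiable rounding unchanged.
        return grad_output, None


class QuantLinear(nn.Linear):
    """Linear layer whose weights are fake-quantized in the forward pass."""

    def __init__(self, in_features, out_features, num_bits=4):
        super().__init__(in_features, out_features)
        self.num_bits = num_bits

    def forward(self, x):
        w_q = FakeQuantize.apply(self.weight, self.num_bits)
        return F.linear(x, w_q, self.bias)


def qkd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Mix soft distillation loss (KL at temperature T) with hard cross-entropy."""
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce


# Toy training step: full-precision teacher distills into a 4-bit student.
teacher = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
student = nn.Sequential(
    QuantLinear(32, 64, num_bits=4), nn.ReLU(), QuantLinear(64, 10, num_bits=4)
)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

x = torch.randn(16, 32)
labels = torch.randint(0, 10, (16,))
with torch.no_grad():
    teacher_logits = teacher(x)          # teacher is frozen during distillation
student_logits = student(x)              # student sees quantized weights
loss = qkd_loss(student_logits, teacher_logits, labels)
loss.backward()
optimizer.step()
```

In practice the same pattern extends to activation quantization, learned clipping ranges, and layer-wise or logit-plus-feature distillation objectives; the key idea is simply that the quantized student is optimized against the teacher's outputs rather than labels alone.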