Quantization Error
Quantization error arises from representing continuous-valued data (e.g., neural network weights and activations) with a limited number of bits; the resulting rounding and clipping distortions degrade model accuracy, which is the price paid for the efficiency gains of low-precision inference. Current research focuses on mitigating this error in large language models (LLMs) and vision transformers (ViTs), employing techniques such as post-training quantization, quantization-aware training, and novel quantization algorithms (e.g., those incorporating learned rotations or adaptive clipping). Reducing quantization error is crucial for deploying large models on resource-constrained devices, improving energy efficiency, and enabling wider accessibility of advanced AI applications.
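As a concrete illustration, the sketch below shows symmetric uniform quantization of a weight tensor and measures the resulting error at several bit-widths. This is a minimal, hypothetical example for intuition only; the function name `quantize_dequantize`, the per-tensor scaling choice, and the random stand-in weights are assumptions, not the method of any particular paper cited on this page.

```python
import numpy as np

def quantize_dequantize(w: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Symmetric uniform quantization followed by dequantization (illustrative sketch)."""
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 127 for int8
    scale = np.max(np.abs(w)) / qmax            # per-tensor scale (an assumption; real schemes vary)
    q = np.clip(np.round(w / scale), -qmax, qmax)  # round-to-nearest, then clip to the integer range
    return q * scale                            # map back to floating point

rng = np.random.default_rng(0)
w = rng.normal(size=(1024,)).astype(np.float32)  # stand-in for a layer's weights

for bits in (8, 4, 2):
    w_hat = quantize_dequantize(w, bits)
    mse = np.mean((w - w_hat) ** 2)
    print(f"{bits}-bit quantization error (MSE): {mse:.6f}")
```

Running the loop shows the error growing as the bit-width shrinks, which is the trade-off that post-training quantization and quantization-aware training methods aim to manage.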