Vision Transformer Quantization
Vision transformer (ViT) quantization aims to reduce the computational cost and memory footprint of these powerful but resource-intensive models by representing their weights and activations with lower-precision numbers. Current research focuses on effective quantization techniques, particularly post-training quantization, for architectures such as ViT, DeiT, and Swin Transformer, often employing strategies like mixed-precision quantization and novel quantization schemes to mitigate performance degradation. These efforts matter because they enable the deployment of ViTs on resource-constrained devices, expanding their applicability in areas such as mobile and embedded vision systems.
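To make the core idea concrete, here is a minimal sketch of per-tensor symmetric uniform quantization in NumPy — the basic building block that post-training schemes for ViTs refine (e.g. with per-channel scales or mixed precision). The function names and the 8-bit default are illustrative assumptions, not from any specific paper above.

```python
import numpy as np

def quantize_uniform(w, num_bits=8):
    # Symmetric uniform quantization: map floats to signed integers.
    # Assumes num_bits <= 8 so values fit in int8 (illustrative choice).
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = float(np.max(np.abs(w))) / qmax  # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float tensor.
    return q.astype(np.float32) * scale

# Stand-in for a ViT weight matrix; real schemes quantize attention
# and MLP weights (and activations) layer by layer.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_uniform(w)
w_hat = dequantize(q, scale)
# Round-off error per element is bounded by about scale / 2.
```

Post-training methods calibrate such scales from a small unlabeled dataset rather than retraining, which is what makes them attractive for deploying pretrained ViTs.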