GPUSQ ViT
GPUSQ-ViT research focuses on deploying Vision Transformers (ViTs) efficiently on GPUs, using quantization to reduce computational cost and memory footprint without significant accuracy loss. Current efforts concentrate on quantization methods tailored to the activation distributions found in ViTs; these often combine mixed-precision strategies with explicit handling of outliers to preserve accuracy, particularly at low bit-widths (e.g., 4-bit). The work matters because it makes it practical to deploy powerful ViT models on resource-constrained devices, broadening their use across computer vision tasks.
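To make the outlier problem at low bit-widths concrete, the following is a minimal sketch of uniform symmetric 4-bit quantization on synthetic activations. It does not reproduce the method of any paper listed here. The function name `quantize_symmetric`, the `clip_percentile` parameter, and the synthetic data are all assumptions chosen for illustration; percentile clipping stands in for the more elaborate outlier handling used in practice.

```python
import numpy as np

def quantize_symmetric(x, num_bits=4, clip_percentile=100.0):
    """Fake-quantize x to signed num_bits levels and map back to floats.

    Illustrative helper (not from any library): clip_percentile < 100
    ignores the largest magnitudes when choosing the quantization range.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 7 for 4-bit
    max_abs = np.percentile(np.abs(x), clip_percentile)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax)  # integer levels in [-7, 7]
    return q * scale                          # dequantized values

# Synthetic "activations": mostly small values plus a few large outliers.
rng = np.random.default_rng(0)
acts = rng.normal(0.0, 1.0, size=10_000)
acts[:10] = 50.0

q_full = quantize_symmetric(acts, num_bits=4, clip_percentile=100.0)
q_clip = quantize_symmetric(acts, num_bits=4, clip_percentile=99.0)

# Error on the non-outlier entries only.
mse_full = float(np.mean((acts[10:] - q_full[10:]) ** 2))
mse_clip = float(np.mean((acts[10:] - q_clip[10:]) ** 2))
print(f"bulk MSE, range set by max:        {mse_full:.4f}")
print(f"bulk MSE, range set by 99th pctl:  {mse_clip:.4f}")
```

With the range set by the maximum, the few outliers stretch the step size so far that most ordinary values collapse onto a single level. Clipping the range trades a large error on the outliers for a much smaller error on everything else, which is one reason mixed-precision schemes keep outlier-heavy tensors at higher precision.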
Papers
December 23, 2024
August 6, 2024
July 3, 2024
January 26, 2024
May 18, 2023
December 16, 2022