Model Requantization
Model requantization converts trained deep learning models, particularly large ones such as Vision Transformers, into lower-precision representations for deployment on resource-constrained devices. Current research emphasizes algorithms that minimize accuracy loss during this conversion, exploring techniques such as power-of-two scaling factors and generative adversarial networks that synthesize calibration data. This work is crucial for bringing advanced deep learning models to edge and mobile applications, since it substantially reduces computational cost and memory footprint.
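To make the power-of-two idea concrete, the sketch below quantizes a tensor to signed 8-bit integers while restricting the scale factor to a power of two, so dequantization (and requantization between layers) reduces to a bit shift rather than a floating-point multiply. This is a minimal NumPy illustration; the function names and the specific rounding scheme are assumptions for demonstration, not taken from any particular paper.

```python
# Minimal sketch of power-of-two-scaled quantization (illustrative names,
# not from a specific paper or library).
import numpy as np

def quantize_pow2(x: np.ndarray, num_bits: int = 8):
    """Quantize x to signed integers using a power-of-two scale factor.

    Restricting the scale to 2**k means dequantization is x ~ q * 2**k,
    which hardware can implement as a shift instead of a float multiply.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for int8
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return np.zeros_like(x, dtype=np.int8), 0
    # Ideal real-valued scale, then round its exponent up to a power of two
    # so the full range of x still fits in [-qmax-1, qmax].
    ideal_scale = max_abs / qmax
    k = int(np.ceil(np.log2(ideal_scale)))
    scale = 2.0 ** k
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, k                               # store the exponent, not a float scale

def dequantize_pow2(q: np.ndarray, k: int) -> np.ndarray:
    """Recover an approximation of the original tensor: x ~ q * 2**k."""
    return q.astype(np.float32) * (2.0 ** k)

# Example: round-trip a random weight tensor and measure the error.
w = np.random.randn(4, 4).astype(np.float32)
q, k = quantize_pow2(w)
w_hat = dequantize_pow2(q, k)
print("exponent k =", k, " max abs error =", np.max(np.abs(w - w_hat)))
```

The trade-off is that constraining the scale to powers of two can waste a little of the integer range compared with an arbitrary float scale, which is part of why research in this area focuses on limiting the resulting accuracy loss.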