Model Requantization

Model requantization converts trained deep learning models, particularly large models such as Vision Transformers, into lower-precision representations so they can be deployed efficiently on resource-constrained devices. Current research emphasizes algorithms that minimize accuracy loss during this conversion, exploring techniques such as power-of-two scaling factors and generative adversarial networks for synthetic-data calibration. This work is crucial for the wider adoption of advanced deep learning models in edge computing and mobile applications, since it significantly reduces computational cost and memory footprint.
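To make the power-of-two scaling idea concrete, the sketch below shows a minimal symmetric quantizer whose scale is rounded to the nearest power of two, so that dequantization can be implemented as a bit shift on integer hardware. This is an illustrative example, not the method of any specific paper; the function names and the 8-bit setting are assumptions for the sketch.

```python
import numpy as np

def quantize_pow2(weights, num_bits=8):
    """Symmetric quantization with a power-of-two scale factor.

    The ideal real-valued scale is snapped to the nearest power of two,
    so dequantization (q * 2^k) maps to a shift on integer hardware.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8-bit
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / qmax                    # ideal real-valued scale
    pow2_exp = int(np.round(np.log2(scale)))  # nearest power-of-two exponent
    pow2_scale = 2.0 ** pow2_exp
    q = np.clip(np.round(weights / pow2_scale), -qmax - 1, qmax).astype(np.int8)
    return q, pow2_scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

# Usage: quantize a small weight tensor and inspect the rounding error.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_pow2(w)
w_hat = dequantize(q, s)
print("power-of-two scale:", s)
print("max abs error:", np.max(np.abs(w - w_hat)))
```

Because the scale is constrained to powers of two, it can deviate from the ideal scale by up to a factor of √2 in either direction, which is one source of the accuracy loss that requantization research tries to minimize.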

Papers