Model Requantization

Model requantization converts trained deep learning models, particularly large models such as Vision Transformers, into lower-precision representations so they can be deployed efficiently on resource-constrained devices. Current research emphasizes algorithms that minimize accuracy loss during this conversion, exploring techniques such as power-of-two scaling factors and generative adversarial networks for synthetic-data calibration. This work is crucial for the wider adoption of advanced deep learning models in edge computing and mobile applications, since it significantly reduces computational cost and memory footprint.
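To make the power-of-two scaling idea concrete, the sketch below shows a minimal symmetric quantizer whose scale is rounded to the nearest power of two, so that dequantization can be implemented as a bit shift on integer hardware. This is an illustrative example, not the method of any specific paper; the function names and the 8-bit setting are assumptions for the sketch.

```python
import numpy as np

def quantize_pow2(weights, num_bits=8):
    """Symmetric quantization with a power-of-two scale factor.

    The ideal real-valued scale is snapped to the nearest power of two,
    so dequantization (q * 2^k) maps to a shift on integer hardware.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8-bit
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / qmax                    # ideal real-valued scale
    pow2_exp = int(np.round(np.log2(scale)))  # nearest power-of-two exponent
    pow2_scale = 2.0 ** pow2_exp
    q = np.clip(np.round(weights / pow2_scale), -qmax - 1, qmax).astype(np.int8)
    return q, pow2_scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

# Usage: quantize a small weight tensor and inspect the rounding error.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_pow2(w)
w_hat = dequantize(q, s)
print("power-of-two scale:", s)
print("max abs error:", np.max(np.abs(w - w_hat)))
```

Because the scale is constrained to powers of two, it can deviate from the ideal scale by up to a factor of √2 in either direction, which is one source of the accuracy loss that requantization research tries to minimize.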

Papers