Model Distillation

Model distillation aims to create smaller, faster "student" models that approximate the performance of larger, more complex "teacher" models. Current research focuses on improving distillation techniques for various architectures, including transformers and diffusion models, often employing strategies like multi-step distillation, chain-of-thought prompting, and bespoke solvers to enhance efficiency and accuracy. This work is significant because it addresses the computational limitations of large models, enabling deployment on resource-constrained devices and accelerating inference times for applications ranging from image generation to natural language processing. Furthermore, research explores the impact of distillation on model fairness and interpretability.

Papers

November 22, 2022

A Generic Approach for Reproducible Model Distillation
Yunzhe Zhou, Peiru Xu, Giles Hooker
Symbolic Regression Interpretable Machine Learning Student Model Model Distillation

October 12, 2022

MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers
Mohammadmahdi Nouriborji, Omid Rohanian, Samaneh Kouchaki, David A. Clifton
Natural Language Processing Pre Trained Language Model Model Distillation Recursive Transformer

May 23, 2022

IDEAL: Query-Efficient Data-Free Learning from Black-box Models
Jie Zhang, Chen Chen, Lingjuan Lyu
Knowledge Distillation Training Data Black Box Model Model Distillation Free Learning

January 21, 2022

January 15, 2022

Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Yoshitomo Matsubara, Luca Soldaini, Eric Lind, Alessandro Moschitti
Application Proficiency High Efficiency Question Answering System Transformer Layer Large Transformer Model Model Distillation Ranking Consistency Large Scale Transformer Transformer Ensemble Heterogeneous Transformer

Model Distillation

Papers

A Generic Approach for Reproducible Model Distillation

MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers

IDEAL: Query-Efficient Data-Free Learning from Black-box Models

Can Model Compression Improve NLP Fairness

AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models

Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems