Knowledge Distillation
Knowledge distillation is a machine learning technique that transfers knowledge from a large, complex "teacher" model to a smaller, more efficient "student" model, improving the student's performance while reducing computational cost. Current research focuses on improving distillation methods across model architectures, including convolutional neural networks, transformers, and large language models, often combining techniques such as parameter-efficient fine-tuning, multi-task learning, and data augmentation to enhance knowledge transfer. The approach is significant because it enables the deployment of high-performing models on resource-constrained devices and addresses challenges of model size, training time, and privacy in applications such as image captioning, speech processing, and medical diagnosis.
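As a concrete illustration of the core idea, below is a minimal sketch of the classic soft-label distillation loss in the style of Hinton et al. (2015), assuming PyTorch. The function name, temperature, and mixing weight `alpha` are illustrative choices, not taken from any of the papers listed below.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Soft-label distillation loss: a weighted mix of (a) KL divergence
    between temperature-softened teacher and student distributions and
    (b) standard cross-entropy on the hard ground-truth labels."""
    # Soften both output distributions with the temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL term is scaled by T^2 so gradient magnitudes stay comparable
    # to the cross-entropy term as the temperature changes.
    kd_loss = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * temperature ** 2

    # Ordinary supervised loss on the hard labels.
    ce_loss = F.cross_entropy(student_logits, labels)

    return alpha * kd_loss + (1 - alpha) * ce_loss

# Example: one batch of hypothetical logits from a teacher and a student.
teacher_logits = torch.randn(32, 10)                       # teacher outputs
student_logits = torch.randn(32, 10, requires_grad=True)   # student outputs
labels = torch.randint(0, 10, (32,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

This is only the baseline logit-matching recipe; several of the papers below replace or augment it, for example by swapping the KL divergence for a Wasserstein distance or by matching intermediate features rather than final logits.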
Papers
A Theoretical Analysis of Soft-Label vs Hard-Label Training in Neural Networks
Saptarshi Mandal, Xiaojun Lin, R. Srikant
All You Need in Knowledge Distillation Is a Tailored Coordinate System
Junjie Zhou, Ke Zhu, Jianxin Wu
Multimodal Industrial Anomaly Detection by Crossmodal Reverse Distillation
Xinyue Liu, Jianyuan Wang, Biao Leng, Shuo Zhang
Dynamic Contrastive Knowledge Distillation for Efficient Image Restoration
Yunshuai Zhou, Junbo Qiao, Jincheng Liao, Wei Li, Simiao Li, Jiao Xie, Yunhang Shen, Jie Hu, Shaohui Lin
Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation
Jiaming Lv, Haoyuan Yang, Peihua Li
DAKD: Data Augmentation and Knowledge Distillation using Diffusion Models for SAR Oil Spill Segmentation
Jaeho Moon, Jeonghwan Yun, Jaehyun Kim, Jaehyup Lee, Munchurl Kim
Efficient Gravitational Wave Parameter Estimation via Knowledge Distillation: A ResNet1D-IAF Approach
Xihua Zhu, Yiqian Yang, Fan Zhang
Active Data Curation Effectively Distills Large-Scale Multimodal Models
Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem, Talfan Evans, Samuel Albanie, Federico Tombari, Yongqin Xian, Alessio Tonioni, Olivier J. Hénaff
Improved implicit diffusion model with knowledge distillation to estimate the spatial distribution density of carbon stock in remote sensing imagery
Zhenyu Yu