Feature Distillation
Feature distillation is a knowledge distillation technique that transfers knowledge from a large, complex "teacher" model to a smaller, more efficient "student" model by training the student to match the teacher's intermediate feature representations, improving the student's performance while reducing computational costs. Current research focuses on refining these methods across diverse architectures, including convolutional neural networks (CNNs), vision transformers (ViTs), and diffusion models, often incorporating techniques such as masked feature reconstruction, attention mechanisms, and contrastive learning to enhance knowledge transfer. The approach matters for deploying capable models in resource-constrained settings such as mobile devices, for applications such as medical image analysis, and for improving the generalization and robustness of smaller models.
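The core idea can be illustrated with a minimal sketch, assuming the simplest common setup: a linear adapter maps the student's features into the teacher's feature space, and the loss is the mean squared error between the two. The function names, the adapter weights, and the toy feature values below are hypothetical and chosen only for illustration; they are not taken from any of the listed papers.

```python
def project(features, weights):
    """Map a student feature vector into the teacher's feature space.

    `weights` is a hypothetical linear adapter with one row per teacher dimension.
    """
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def feature_distillation_loss(student_feat, teacher_feat, weights):
    """Mean squared error between projected student features and teacher features."""
    projected = project(student_feat, weights)
    if len(projected) != len(teacher_feat):
        raise ValueError("projected student features must match the teacher dimension")
    return sum((p - t) ** 2 for p, t in zip(projected, teacher_feat)) / len(teacher_feat)


# Toy values (assumed): a 3-dimensional teacher feature and a 2-dimensional student feature.
teacher_feat = [1.0, 0.0, -1.0]
student_feat = [0.5, -0.5]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3x2 adapter, normally learned during training

loss = feature_distillation_loss(student_feat, teacher_feat, weights)
print(loss)
```

In practice this feature-matching term is usually added to the student's ordinary task loss, and both the student and the adapter are updated by gradient descent while the teacher stays frozen.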
Papers
On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process
Gereziher Adhane, Mohammad Mahdi Dehshibi, Dennis Vetter, David Masip, Gemma Roig
GAGS: Granularity-Aware Feature Distillation for Language Gaussian Splatting
Yuning Peng, Haiping Wang, Yuan Liu, Chenglu Wen, Zhen Dong, Bisheng Yang