Feature Distillation

Feature distillation is a machine learning technique for transferring knowledge from a large, complex "teacher" model to a smaller, more efficient "student" model, improving the student's performance while reducing computational cost. Current research focuses on refining distillation methods, particularly feature distillation, across diverse architectures including convolutional neural networks (CNNs), vision transformers (ViTs), and diffusion models. These methods often incorporate techniques such as masked feature reconstruction, attention mechanisms, and contrastive learning to enhance knowledge transfer, as sketched below. This approach is significant for deploying large models in resource-constrained environments, such as mobile devices and medical image analysis, and for improving the generalization and robustness of smaller models.
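As a concrete illustration, the sketch below shows one common form of feature distillation: an intermediate student feature map is projected to the teacher's channel dimension and trained to match the frozen teacher feature map with an MSE objective. The module name, projection choice, and dimensions are illustrative assumptions rather than the method of any specific paper listed here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureDistillationLoss(nn.Module):
    """Minimal feature-distillation loss: align an intermediate student
    feature map with the corresponding (frozen) teacher feature map.
    Layer choice and projection are illustrative assumptions."""

    def __init__(self, student_dim: int, teacher_dim: int):
        super().__init__()
        # A 1x1 convolution maps student channels to the teacher's channel count.
        self.proj = nn.Conv2d(student_dim, teacher_dim, kernel_size=1)

    def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
        # Teacher features serve as fixed regression targets (no gradient).
        teacher_feat = teacher_feat.detach()
        projected = self.proj(student_feat)
        # Match spatial resolution if the two networks downsample differently.
        if projected.shape[-2:] != teacher_feat.shape[-2:]:
            projected = F.interpolate(
                projected, size=teacher_feat.shape[-2:],
                mode="bilinear", align_corners=False,
            )
        return F.mse_loss(projected, teacher_feat)


# Typical usage: add the feature term to the ordinary task loss when training
# the student (the weighting factor is a hyperparameter assumed here).
# distill = FeatureDistillationLoss(student_dim=256, teacher_dim=1024)
# loss = task_loss + 0.5 * distill(student_feat, teacher_feat)
```

In practice, published methods differ mainly in which features are matched (e.g., masked or attention-weighted regions) and in the alignment objective (e.g., contrastive rather than MSE), but the projection-and-match structure above is the common core.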

Papers