Online Distillation

Online distillation is a machine learning technique in which a student model learns from a teacher model while both are trained concurrently, rather than distilling from a fixed, pre-trained teacher; this improves the student's efficiency and performance. Current research applies online distillation to diverse architectures, including transformers and graph neural networks, and explores strategies to enhance training stability, address class imbalance (long-tailed distributions), and improve efficiency through techniques such as parameter-efficient adaptation and adaptive sampling. The approach offers significant potential for accelerating model training, reducing computational costs, and improving generalization across domains such as computer vision, natural language processing, and autonomous driving.
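
The following is a minimal sketch of the concurrent-training idea, assuming a PyTorch-style setup with a single shared optimizer; the model definitions, temperature, and loss weighting are illustrative assumptions rather than the method of any particular paper. The teacher is trained on the ground-truth labels while the student combines the label loss with a KL-divergence loss against the teacher's temperature-softened logits.

import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2

def online_distillation_step(teacher, student, optimizer, x, y, alpha=0.5):
    """One concurrent update of teacher and student on the same mini-batch."""
    teacher_logits = teacher(x)
    student_logits = student(x)

    # Teacher is trained on the ground-truth labels only.
    teacher_loss = F.cross_entropy(teacher_logits, y)

    # Student mixes the label loss with the distillation loss; the teacher's
    # logits are detached so the distillation term does not push gradients
    # back into the teacher.
    student_loss = (1 - alpha) * F.cross_entropy(student_logits, y) \
        + alpha * distillation_loss(student_logits, teacher_logits.detach())

    optimizer.zero_grad()
    (teacher_loss + student_loss).backward()
    optimizer.step()
    return teacher_loss.item(), student_loss.item()

# Illustrative usage with small MLPs on random data.
teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
optimizer = torch.optim.Adam(
    list(teacher.parameters()) + list(student.parameters()), lr=1e-3
)

x = torch.randn(64, 32)
y = torch.randint(0, 10, (64,))
print(online_distillation_step(teacher, student, optimizer, x, y))

Because both networks are updated in the same step, the student receives progressively better soft targets as the teacher improves, which is the defining difference from offline distillation against a frozen teacher.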

Papers