Teacher Model
Teacher models are large, pre-trained models used in knowledge distillation: a smaller, more efficient student model is trained to reproduce the teacher's outputs so that it retains as much of the teacher's performance as possible. Current research focuses on improving the accuracy and efficiency of this knowledge transfer, exploring techniques such as data augmentation, loss function design (e.g., MSE loss between teacher and student outputs), and alternative training setups such as multi-teacher and online distillation frameworks. This work matters because it reduces the computational cost and resource requirements of deploying large language and vision models, making them more accessible for applications including object detection, natural language processing, and ecological monitoring.
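As a rough illustration of the loss functions mentioned above, the following minimal sketch compares a student's logits with a teacher's logits in two ways. The first is an MSE loss on the raw logits, the example named in the summary. The second is a temperature-softened KL divergence, a commonly used alternative that is an assumption here and is not taken from any of the listed papers. All function names and the temperature value are hypothetical and chosen only for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mse_logit_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher logits."""
    pairs = zip(student_logits, teacher_logits)
    return sum((s - t) ** 2 for s, t in pairs) / len(student_logits)


def kl_soft_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The result is scaled by temperature**2, a common convention that keeps
    the loss magnitude comparable across temperatures (assumed here).
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]  # made-up logits for illustration only
    student = [2.5, 1.5, 0.2]
    print("MSE on logits:", mse_logit_loss(student, teacher))
    print("Soft-target KL:", kl_soft_loss(student, teacher))
```

In practice such a distillation term is often combined with the ordinary task loss on ground-truth labels, and the weighting between the two is a tuning choice rather than something fixed by the technique.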
Papers
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models
Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li
GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment
Yao Yao, Zuchao Li, Hai Zhao