Knowledge Distillation
Knowledge distillation is a machine learning technique that transfers knowledge from a large, complex "teacher" model to a smaller, more efficient "student" model, with the goal of retaining most of the teacher's accuracy while reducing computational and deployment costs. Current research focuses on improving distillation methods across model architectures, including convolutional neural networks, transformers, and large language models, often combining them with parameter-efficient fine-tuning, multi-task learning, and data augmentation to strengthen knowledge transfer. The approach matters because it enables high-performing models to run on resource-constrained devices and addresses challenges of model size, training time, and privacy in applications such as image captioning, speech processing, and medical diagnosis.
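To make the teacher-to-student transfer concrete, here is a minimal sketch of the classic response-based distillation objective (soft targets from the teacher blended with hard-label cross-entropy, as in Hinton et al., 2015). The toy models, temperature, and weighting values are illustrative assumptions and are not taken from any of the papers listed below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend temperature-softened KL divergence (scaled by T^2) with hard-label cross-entropy."""
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Toy teacher/student pair for illustration only.
teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10)).eval()
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

x = torch.randn(64, 32)               # dummy batch of features
labels = torch.randint(0, 10, (64,))  # dummy hard labels

with torch.no_grad():                 # the teacher stays frozen during distillation
    teacher_logits = teacher(x)

loss = distillation_loss(student(x), teacher_logits, labels)
loss.backward()
optimizer.step()
```

In this sketch the temperature T softens both distributions so the student can learn from the teacher's relative class probabilities ("dark knowledge"), while alpha trades off imitation of the teacher against fitting the ground-truth labels; many of the papers below build variants of this basic recipe for specific architectures or tasks.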
Papers
Oh! We Freeze: Improving Quantized Knowledge Distillation via Signal Propagation Analysis for Large Language Models
Kartikeya Bhardwaj, Nilesh Prasad Pandey, Sweta Priyadarshi, Kyunggeun Lee, Jun Ma, Harris Teague
The Need for Speed: Pruning Transformers with One Recipe
Samir Khaki, Konstantinos N. Plataniotis
KDMCSE: Knowledge Distillation Multimodal Sentence Embeddings with Adaptive Angular margin Contrastive Learning
Cong-Duy Nguyen, Thong Nguyen, Xiaobao Wu, Anh Tuan Luu
From Two-Stream to One-Stream: Efficient RGB-T Tracking via Mutual Prompt Learning and Knowledge Distillation
Yang Luo, Xiqing Guo, Hao Li
Distilling Semantic Priors from SAM to Efficient Image Restoration Models
Quan Zhang, Xiaoyu Liu, Wei Li, Hanting Chen, Junchao Liu, Jie Hu, Zhiwei Xiong, Chun Yuan, Yunhe Wang
Learning to Project for Cross-Task Knowledge Distillation
Dylan Auty, Roy Miles, Benedikt Kolbeinsson, Krystian Mikolajczyk
From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision
Qingwen Lin, Boyan Xu, Zhengting Huang, Ruichu Cai
MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation
Longzheng Wang, Xiaohan Xu, Lei Zhang, Jiarui Lu, Yongxiu Xu, Hongbo Xu, Minghao Tang, Chuang Zhang
KnFu: Effective Knowledge Fusion
S. Jamal Seyedmohammadi, S. Kawa Atapour, Jamshid Abouei, Arash Mohammadi
TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models
Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla
Scheduled Knowledge Acquisition on Lightweight Vector Symbolic Architectures for Brain-Computer Interfaces
Yejia Liu, Shijin Duan, Xiaolin Xu, Shaolei Ren
Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection
Martin Aubard, László Antal, Ana Madureira, Erika Ábrahám
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Yu-Chu Yu, Chi-Pin Huang, Jr-Jen Chen, Kai-Po Chang, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang