Knowledge Distillation
Knowledge distillation is a machine learning technique that transfers knowledge from a large, complex "teacher" model to a smaller, more efficient "student" model, aiming to preserve as much of the teacher's performance as possible while cutting computational cost. Current research focuses on improving distillation methods across model architectures, including convolutional neural networks, transformers, and large language models, often combining them with parameter-efficient fine-tuning, multi-task learning, and data augmentation to strengthen knowledge transfer. The approach is significant because it enables high-performing models to run on resource-constrained devices and mitigates challenges around model size, training time, and privacy in applications as diverse as image captioning, speech processing, and medical diagnosis.
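The classic recipe behind this transfer is the soft-target loss of Hinton et al. (2015): the student is trained to match both the ground-truth labels and the teacher's temperature-softened output distribution. Below is a minimal PyTorch sketch of that recipe, illustrative only: the toy teacher/student networks, the temperature of 4.0, and the 0.5 weighting are arbitrary placeholder choices, not settings drawn from any of the papers listed here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with a KL term that pulls the
    student's softened distribution toward the teacher's."""
    # Hard-label term: standard cross-entropy against ground truth.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label term: KL divergence at temperature T. kl_div expects
    # log-probabilities for the input and probabilities for the target.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_student, soft_teacher, reduction="batchmean")

    # The T^2 factor keeps the soft-target gradients on the same scale
    # as the hard-label term regardless of the temperature chosen.
    return alpha * ce + (1.0 - alpha) * (temperature ** 2) * kd

# Toy setup: in practice the teacher is a large pretrained network and
# the student a much smaller one; these layers are stand-ins.
teacher = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
student = nn.Linear(32, 10)
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

inputs = torch.randn(16, 32)            # synthetic batch
labels = torch.randint(0, 10, (16,))    # synthetic targets

teacher.eval()
with torch.no_grad():
    teacher_logits = teacher(inputs)    # teacher is frozen

student_logits = student(inputs)
loss = distillation_loss(student_logits, teacher_logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In practice the temperature and the blending weight are tuned per task; higher temperatures expose more of the teacher's inter-class similarity structure, which is much of what the student gains over training on hard labels alone. Many of the papers below replace or augment this logit-matching term, for example with feature-level or embedding-level distillation.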
Papers
Towards Multi-Morphology Controllers with Diversity and Knowledge Distillation
Alican Mertan, Nick Cheney
Brain-Inspired Continual Learning: Robust Feature Distillation and Re-Consolidation for Class Incremental Learning
Hikmat Khan, Nidhal Carla Bouaynaya, Ghulam Rasool
Distributed Learning for Wi-Fi AP Load Prediction
Dariush Salami, Francesc Wilhelmi, Lorenzo Galati-Giordano, Mika Kasslin
ReffAKD: Resource-efficient Autoencoder-based Knowledge Distillation
Divyang Doshi, Jung-Eun Kim
MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution
Yuxuan Jiang, Chen Feng, Fan Zhang, David Bull
AI-KD: Towards Alignment Invariant Face Image Quality Assessment Using Knowledge Distillation
Žiga Babnik, Fadi Boutros, Naser Damer, Peter Peer, Vitomir Štruc
Robust feature knowledge distillation for enhanced performance of lightweight crack segmentation models
Zhaohui Chen, Elyas Asadi Shamsabadi, Sheng Jiang, Luming Shen, Daniel Dias-da-Costa
CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers
Lakshmi Nair
Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation
Zong-Wei Hong, Yu-Chen Lin
Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model
Zhonghan Zhao, Ke Ma, Wenhao Chai, Xuan Wang, Kewei Chen, Dongxu Guo, Yanting Zhang, Hongwei Wang, Gaoang Wang
Diffusion Time-step Curriculum for One Image to 3D Generation
Xuanyu Yi, Zike Wu, Qingshan Xu, Pan Zhou, Joo-Hwee Lim, Hanwang Zhang