Knowledge Distillation
Knowledge distillation is a machine learning technique that transfers knowledge from a large, complex "teacher" model to a smaller, more efficient "student" model, with the aim of boosting the student's performance while reducing computational cost. Current research focuses on improving distillation for a range of architectures, including convolutional neural networks, transformers, and large language models, and often combines it with techniques such as parameter-efficient fine-tuning, multi-task learning, and data augmentation to strengthen knowledge transfer. The approach is significant because it enables high-performing models to run on resource-constrained devices and addresses challenges around model size, training time, and privacy in applications ranging from image captioning and speech processing to medical diagnosis.
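In its classic logit-based form (following Hinton et al.'s formulation), the student is trained on a weighted mix of the teacher's temperature-softened output distribution and the ground-truth labels. The PyTorch-style sketch below illustrates this standard loss only; the function name, temperature, and weighting factor are illustrative choices and do not correspond to the method of any particular paper listed here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Standard soft-target distillation loss (illustrative values for T and alpha)."""
    # Soft-target term: KL divergence between temperature-scaled
    # student and teacher distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)  # rescale so gradients are comparable to the hard-label term

    # Hard-target term: ordinary cross-entropy against ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In a typical training loop the teacher is frozen and run under torch.no_grad() to produce teacher_logits, so only the student's parameters receive gradients from this loss.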
Papers
MH-pFLID: Model Heterogeneous personalized Federated Learning via Injection and Distillation for Medical Data Analysis
Luyuan Xie, Manqing Lin, Tianyu Luan, Cong Li, Yuejian Fang, Qingni Shen, Zhonghai Wu
For the Misgendered Chinese in Gender Bias Research: Multi-Task Learning with Knowledge Distillation for Pinyin Name-Gender Prediction
Xiaocong Du, Haipeng Zhang
Semantic Objective Functions: A distribution-aware method for adding logical constraints in deep learning
Miguel Angel Mendez-Lucero, Enrique Bojorquez Gallardo, Vaishak Belle
Advancing Pre-trained Teacher: Towards Robust Feature Discrepancy for Anomaly Detection
Canhui Tang, Sanping Zhou, Yizhe Li, Yonghao Dong, Le Wang
Knowledge Distillation vs. Pretraining from Scratch under a Fixed (Computation) Budget
Minh Duc Bui, Fabian David Schmidt, Goran Glavaš, Katharina von der Wense
Why does Knowledge Distillation Work? Rethink its Attention and Fidelity Mechanism
Chenqi Guo, Shiwei Zhong, Xiaofeng Liu, Qianli Feng, Yinglong Ma
Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities
Mingcheng Li, Dingkang Yang, Xiao Zhao, Shuaibing Wang, Yan Wang, Kun Yang, Mingyang Sun, Dongliang Kou, Ziyun Qian, Lihua Zhang
Dual Expert Distillation Network for Generalized Zero-Shot Learning
Zhijie Rao, Jingcai Guo, Xiaocheng Lu, Jingming Liang, Jie Zhang, Haozhao Wang, Kang Wei, Xiaofeng Cao