Online Knowledge Distillation

Online knowledge distillation (OKD) is a machine learning technique that improves the efficiency and performance of smaller "student" models by training them collaboratively with larger "teacher" models, eliminating the need for a pre-trained teacher. Current research focuses on optimizing knowledge transfer mechanisms, particularly addressing challenges like model homogenization and efficient knowledge representation across various architectures, including convolutional neural networks (CNNs), vision transformers (ViTs), and graph neural networks (GNNs), often employing techniques like contrastive learning and attention mechanisms. OKD's significance lies in its potential to reduce computational costs and improve the performance of models deployed on resource-constrained devices, impacting fields ranging from computer vision and natural language processing to reinforcement learning and personalized education.

Papers

July 14, 2022

Large-scale Knowledge Distillation with Elastic Heterogeneous Computing Resources
Ji Liu, Daxiang Dong, Xi Wang, An Qin, Xingjian Li, Patrick Valduriez, Dejing Dou, Dianhai Yu
Knowledge Distillation Heterogeneous Computing Online Knowledge Distillation Elastic Net

July 10, 2022

1st Place Solution to the EPIC-Kitchens Action Anticipation Challenge 2022
Zeyu Jiang, Changxing Ding
Place Solution Action Anticipation Early Slavic Participle Online Knowledge Distillation Nominal Phrase

June 24, 2022

Mixed Sample Augmentation for Online Distillation
Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo
Knowledge Distillation Online Knowledge Distillation Online Distillation Mixed Sample Data Augmentation

June 16, 2022

Multi scale Feature Extraction and Fusion for Online Knowledge Distillation
Panpan Zou, Yinglei Teng, Tao Niu
Hybrid Fusion Knowledge Representation Multi Scale Feature Multi Scale Representation Online Knowledge Distillation

May 5, 2022

Alignahead: Online Cross-Layer Knowledge Extraction on Graph Neural Networks
Jiongyu Guo, Defang Chen, Can Wang
Graph Neural Network LD Align Online Knowledge Distillation Joint Distillation Student GNN

March 22, 2022

Channel Self-Supervision for Online Knowledge Distillation
Shixiao Fan, Xuan Cheng, Xiaomin Wang, Chun Yang, Pan Deng, Minghui Liu, Jiali Deng, Ming Liu
Mutual Distillation Generalization Capability Differentiable Channel Online Knowledge Distillation Collaborative Model

March 8, 2022

Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation
Chenze Shao, Yang Feng
Continual LEArning Catastrophic Forgetting Neural Machine Translation Neural Machine Translation Model Online Knowledge Distillation

November 23, 2021

Semi-Online Knowledge Distillation
Zhiqiang Liu, Yanxia Liu, Chengkai Huang
Knowledge Distillation Mutual Learning Online Knowledge Distillation Conventional Knowledge Distillation