Self-Distillation
Self-distillation is a machine learning technique in which a model learns from its own predictions, improving performance and efficiency without requiring a separate teacher model. Current research applies self-distillation across diverse tasks and architectures, including spiking neural networks, transformers, and deep learning models for image, point cloud, and natural language processing. The approach is particularly valuable in resource-constrained settings, enabling model compression and better performance when data or compute is limited, with applications in robotics, medical imaging, and natural language understanding.
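To make the idea concrete, below is a minimal sketch of one common self-distillation setup, assuming a toy classifier whose "teacher" is an exponential-moving-average copy of its own weights; the model, data, and hyperparameters are illustrative placeholders and do not reproduce the method of any specific paper listed here.

```python
# Minimal self-distillation sketch (assumption: EMA copy of the model acts as its own teacher).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Cross-entropy on labels plus KL divergence to the model's own softened outputs."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return (1 - alpha) * ce + alpha * kl

# Toy model; in practice this would be a spiking network, transformer, etc.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
teacher = copy.deepcopy(model)          # the model's own weights, frozen and smoothed
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ema_decay = 0.99

for step in range(100):
    x = torch.randn(16, 32)             # toy batch; replace with a real dataloader
    y = torch.randint(0, 10, (16,))

    with torch.no_grad():
        teacher_logits = teacher(x)     # targets come from the model itself
    student_logits = model(x)

    loss = distillation_loss(student_logits, teacher_logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

    # Keep the teacher as an exponential moving average of the student,
    # so the model distills from a smoothed copy of its own predictions.
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), model.parameters()):
            pt.mul_(ema_decay).add_(ps, alpha=1 - ema_decay)
```

The key design choice is where the teacher signal comes from: an EMA copy (as sketched above), an earlier checkpoint, or deeper layers of the same network are all common variants, but in every case no separately trained teacher model is required.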
Papers
Self-distillation with Online Diffusion on Batch Manifolds Improves Deep Metric Learning
Zelong Zeng, Fan Yang, Hong Liu, Shin'ichi Satoh
Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang, Yukang Shi, Hung-Shuo Tai, Zhipeng Zhang, Yuan He, Ke Wang, Kaisheng Ma
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong, Jianmin Bao, Yinglin Zheng, Ting Zhang, Dongdong Chen, Hao Yang, Ming Zeng, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu
Towards Federated Learning against Noisy Labels via Local Self-Regularization
Xuefeng Jiang, Sheng Sun, Yuwei Wang, Min Liu