Self-Knowledge Distillation

Self-knowledge distillation (SKD) is a machine learning technique that improves model performance by training a model to mimic its own behavior, for example its predictions from earlier training stages, deeper layers, or differently augmented views of the same input, eliminating the need for a separate teacher model. Current research applies SKD across diverse architectures, including convolutional neural networks (CNNs), transformers, and graph neural networks (GNNs), often combining it with techniques such as adversarial learning, multi-source information fusion, and adaptive frequency masking to strengthen knowledge transfer and generalization. Because SKD improves accuracy, robustness, and efficiency in applications ranging from image classification and natural language processing to medical image segmentation, it remains a significant area of ongoing research.
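As a rough illustration of the idea, the sketch below shows one common SKD variant in PyTorch: the model distills from a frozen snapshot of itself taken at the end of the previous epoch, so no separate teacher network is needed. The function names and hyperparameters (`temperature`, `alpha`) are illustrative assumptions, not taken from any specific paper listed here.

```python
# Minimal self-knowledge distillation sketch (progressive/snapshot variant):
# the model's own softened predictions from the previous epoch act as soft targets.
import copy
import torch
import torch.nn.functional as F

def skd_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.5):
    """Cross-entropy on hard labels plus KL divergence against the
    model's own earlier, temperature-softened predictions."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return (1.0 - alpha) * ce + alpha * kd

def train_one_epoch(model, snapshot, loader, optimizer, device="cpu"):
    """One training epoch; `snapshot` is a frozen copy of the model from the
    previous epoch, serving as the 'self' teacher."""
    model.train()
    snapshot.eval()
    for inputs, labels in loader:
        inputs, labels = inputs.to(device), labels.to(device)
        with torch.no_grad():
            teacher_logits = snapshot(inputs)
        student_logits = model(inputs)
        loss = skd_loss(student_logits, teacher_logits, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # Refresh the snapshot so the next epoch distills from the current model.
    return copy.deepcopy(model)
```

Other SKD variants replace the epoch snapshot with auxiliary classifiers attached to intermediate layers or with predictions on differently augmented views, but the loss structure above stays essentially the same.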

Papers