Paper ID: 2109.06458
A Note on Knowledge Distillation Loss Function for Object Classification
Defang Chen
This research note provides a quick introduction to the knowledge distillation loss function used in object classification. In particular, we discuss its connection to a previously proposed logits matching loss function. We further treat knowledge distillation as a specific form of output regularization and show how it relates to label smoothing and entropy-based regularization.
Submitted: Sep 14, 2021
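
As background for the abstract above, the following is a reference sketch of the standard knowledge distillation objective introduced by Hinton et al. (2015); the symbols (student logits \(\mathbf{z}_s\), teacher logits \(\mathbf{z}_t\), temperature \(T\), balancing weight \(\alpha\)) are conventional notation, not reproduced from this note's text:

\[
\mathcal{L}_{\mathrm{KD}} \;=\; (1-\alpha)\,\mathcal{H}\!\big(\mathbf{y},\,\sigma(\mathbf{z}_s)\big) \;+\; \alpha\,T^{2}\,\mathrm{KL}\!\big(\sigma(\mathbf{z}_t/T)\,\big\|\,\sigma(\mathbf{z}_s/T)\big),
\]

where \(\sigma\) denotes the softmax and \(\mathcal{H}\) the cross-entropy with the one-hot label \(\mathbf{y}\). In the high-temperature limit (assuming zero-mean logits), the gradient of the KL term approaches that of the logits matching loss \(\tfrac{1}{2}\lVert\mathbf{z}_s-\mathbf{z}_t\rVert_2^2\); replacing the teacher distribution \(\sigma(\mathbf{z}_t/T)\) with a uniform distribution recovers label smoothing, which is consistent with the output-regularization view the abstract describes.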