Paper ID: 2109.06458

A Note on Knowledge Distillation Loss Function for Object Classification

Defang Chen

This research note provides a quick introduction to the knowledge distillation loss function used in object classification. In particular, we discuss its connection to the previously proposed logit matching loss function. We further treat knowledge distillation as a specific form of output regularization and demonstrate its connections to label smoothing and entropy-based regularization.
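For context, the loss function discussed here is the standard knowledge distillation objective of Hinton et al. (2015). A minimal sketch in common notation follows; the symbols ($z^{s}$, $z^{t}$, $T$, $\alpha$) are illustrative and may differ from those used in the body of the note.

```latex
% Standard knowledge distillation objective (Hinton et al., 2015).
% Notation (z^s, z^t, T, alpha) is illustrative, not the paper's own.
\[
  \mathcal{L}_{\mathrm{KD}}
    = (1-\alpha)\,\mathcal{H}\bigl(y,\ \sigma(z^{s})\bigr)
    + \alpha\, T^{2}\,\mathrm{KL}\bigl(
        \sigma(z^{t}/T) \,\big\|\, \sigma(z^{s}/T)
      \bigr)
\]
% sigma: softmax; H: cross-entropy; y: one-hot label;
% z^s, z^t: student and teacher logits; T: temperature;
% alpha: weight balancing the two terms.
```

In the high-temperature limit ($T \to \infty$), the gradient of the KL term reduces, up to scaling and mean-centering of the logits, to that of the squared logit difference $\tfrac{1}{2}\lVert z^{s} - z^{t}\rVert^{2}$, which is the logit matching connection the abstract refers to.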

Submitted: Sep 14, 2021