Paper ID: 2205.03940 • Published May 8, 2022
Investigating Generalization by Controlling Normalized Margin
Alexander R. Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue
TL;DR
Get AI-generated summaries with premium
Get AI-generated summaries with premium
Weight norm \|w\| and margin γ participate in learning theory via the normalized margin γ/\|w\|. Since standard neural net optimizers do not control normalized margin, it is hard to test whether this quantity causally relates to generalization. This paper designs a series of experimental studies that explicitly control normalized margin and thereby tackle two central questions. First: does normalized margin always have a causal effect on generalization? The paper finds that no -- networks can be produced where normalized margin has seemingly no relationship with generalization, counter to the theory of Bartlett et al. (2017). Second: does normalized margin ever have a causal effect on generalization? The paper finds that yes -- in a standard training setup, test performance closely tracks normalized margin. The paper suggests a Gaussian process model as a promising explanation for this behavior.