Model Complexity
Research on model complexity in machine learning studies how a model's size and structure relate to its performance, with the aim of maximizing accuracy while reducing resource consumption and improving interpretability. Current work examines this relationship across diverse architectures, including transformers, mixtures-of-experts, and other neural network types, using techniques such as feature engineering, model pruning, and complexity metrics that go beyond simple parameter counts. These efforts matter both for the theoretical understanding of generalization and for practical applications, particularly in resource-constrained environments and in safety-critical domains where interpretability is paramount.
Papers
Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient
George Wang, Jesse Hoogland, Stan van Wingerden, Zach Furman, Daniel Murfet
MANTRA: The Manifold Triangulations Assemblage
Rubén Ballester, Ernst Röell, Daniel Bin Schmid, Mathieu Alain, Sergio Escalera, Carles Casacuberta, Bastian Rieck