Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective in current research. Active areas of investigation include improving the robustness of self-supervised learning, understanding the optimization dynamics of transformers and other architectures (including CNNs and RNNs), and developing methods to enhance generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and improved training strategies (e.g., few-shot learning, meta-learning). These advancements are crucial for building reliable and adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
Papers
Deep Neural Networks as the Semi-classical Limit of Topological Quantum Neural Networks: The problem of generalisation
Antonino Marciano, Deen Chen, Filippo Fabrocini, Chris Fields, Matteo Lulli, Emanuele Zappala
Pruning's Effect on Generalization Through the Lens of Training and Regularization
Tian Jin, Michael Carbin, Daniel M. Roy, Jonathan Frankle, Gintare Karolina Dziugaite
Evaluating Parameter Efficient Learning for Generation
Peng Xu, Mostofa Patwary, Shrimai Prabhumoye, Virginia Adams, Ryan J. Prenger, Wei Ping, Nayeon Lee, Mohammad Shoeybi, Bryan Catanzaro
Just Mix Once: Worst-group Generalization by Group Interpolation
Giorgio Giannone, Serhii Havrylov, Jordan Massiah, Emine Yilmaz, Yunlong Jiao
A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models
Lijia Zhou, Frederic Koehler, Pragya Sur, Danica J. Sutherland, Nathan Srebro
Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks
Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Gengchen Yan, Sergey Levine
CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations
Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL
Chen Sun, Wannan Yang, Thomas Jiralerspong, Dane Malenfant, Benjamin Alsbury-Nealy, Yoshua Bengio, Blake Richards