Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures, including CNNs and RNNs; and enhancing generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and improved training strategies (e.g., few-shot learning, meta-learning). These advances are crucial for building reliable, adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
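As one illustration of the techniques mentioned above, consistency regularization penalizes a model for predicting differently on an input and its augmented version. The sketch below is a minimal, hypothetical example (the function name and the toy probability vectors are illustrative, not from any specific paper listed here); it uses a simple mean-squared-difference penalty between the two prediction vectors, one common choice among several (KL divergence is another).

```python
import numpy as np

def consistency_loss(p_clean, p_aug):
    """Mean squared difference between the model's predicted class
    probabilities on a clean input and on an augmented view of it.
    Minimizing this term encourages augmentation-invariant predictions."""
    return float(np.mean((p_clean - p_aug) ** 2))

# Toy softmax outputs for a 3-class problem (illustrative values only).
p_clean = np.array([0.7, 0.2, 0.1])  # prediction on the original input
p_aug = np.array([0.6, 0.3, 0.1])    # prediction on the augmented input

# In training, this term would be added to the supervised loss,
# weighted by a hyperparameter.
loss = consistency_loss(p_clean, p_aug)
```

Identical predictions give a loss of zero, so the penalty only activates when augmentation changes the model's output.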
Papers
The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective
Chi-Heng Lin, Chiraag Kaushik, Eva L. Dyer, Vidya Muthukumar
Everything is Varied: The Surprising Impact of Individual Variation on ML Robustness in Medicine
Andrea Campagner, Lorenzo Famiglini, Anna Carobene, Federico Cabitza
Bayesian Prompt Learning for Image-Language Model Generalization
Mohammad Mahdi Derakhshani, Enrique Sanchez, Adrian Bulat, Victor Guilherme Turrisi da Costa, Cees G. M. Snoek, Georgios Tzimiropoulos, Brais Martinez
The Calibration Generalization Gap
A. Michael Carrell, Neil Mallinar, James Lucas, Preetum Nakkiran
Tree Mover's Distance: Bridging Graph Metrics and Stability of Graph Neural Networks
Ching-Yao Chuang, Stefanie Jegelka
Neural-Symbolic Recursive Machine for Systematic Generalization
Qing Li, Yixin Zhu, Yitao Liang, Ying Nian Wu, Song-Chun Zhu, Siyuan Huang
A Study on the Efficiency and Generalization of Light Hybrid Retrievers
Man Luo, Shashank Jain, Anchit Gupta, Arash Einolghozati, Barlas Oguz, Debojeet Chatterjee, Xilun Chen, Chitta Baral, Peyman Heidari
On the optimization and generalization of overparameterized implicit neural networks
Tianxiang Gao, Hongyang Gao
Emergent Communication: Generalization and Overfitting in Lewis Games
Mathieu Rita, Corentin Tallec, Paul Michel, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
SungYub Kim, Sihwan Park, Kyungsu Kim, Eunho Yang