Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective in current research. Active areas of investigation include improving the robustness of self-supervised learning and understanding the optimization dynamics of transformers and other architectures (including CNNs and RNNs). Researchers are also developing methods to enhance generalization through data augmentation, regularization techniques (e.g., consistency regularization), and improved training strategies (e.g., few-shot learning and meta-learning). These advances are crucial for building reliable and adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
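As a concrete illustration of the data-augmentation techniques mentioned above, the sketch below implements mixup-style augmentation, which blends pairs of training examples to encourage smoother decision boundaries and better generalization. This is a minimal, framework-free sketch for illustration only; the function name `mixup_pair` and its signature are assumptions, not tied to any paper listed on this page.

```python
import random

def mixup_pair(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Return a convex combination of two (input, label) pairs.

    The mixing weight lam is drawn from a Beta(alpha, alpha)
    distribution, so most samples stay close to one of the two
    original examples when alpha is small.
    """
    lam = rng.betavariate(alpha, alpha)
    # Blend inputs and (one-hot) labels with the same weight.
    x_mixed = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y_mixed = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x_mixed, y_mixed, lam
```

In practice the same blending is applied batch-wise inside the training loop, with the label mixture used directly in a soft cross-entropy loss.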
Papers
PAGE: Equilibrate Personalization and Generalization in Federated Learning
Qian Chen, Zilong Wang, Jiaqi Hu, Haonan Yan, Jianying Zhou, Xiaodong Lin
Adaptivity and Modularity for Efficient Generalization Over Task Complexity
Samira Abnar, Omid Saremi, Laurent Dinh, Shantel Wilson, Miguel Angel Bautista, Chen Huang, Vimal Thilak, Etai Littwin, Jiatao Gu, Josh Susskind, Samy Bengio
Feature Learning and Generalization in Deep Networks with Orthogonal Weights
Hannah Day, Yonatan Kahn, Daniel A. Roberts
Why Does Sharpness-Aware Minimization Generalize Better Than SGD?
Zixiang Chen, Junkai Zhang, Yiwen Kou, Xiangning Chen, Cho-Jui Hsieh, Quanquan Gu
Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization
Yuxin Chen, Chen Tang, Ran Tian, Chenran Li, Jinning Li, Masayoshi Tomizuka, Wei Zhan
Robustness May be More Brittle than We Think under Different Degrees of Distribution Shifts
Kaican Li, Yifan Zhang, Lanqing Hong, Zhenguo Li, Nevin L. Zhang
Understanding the Effects of RLHF on LLM Generalisation and Diversity
Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu
Mitigating Simplicity Bias in Deep Learning for Improved OOD Generalization and Robustness
Bhavya Vasudeva, Kameron Shahabi, Vatsal Sharan
Grokking as the Transition from Lazy to Rich Training Dynamics
Tanishq Kumar, Blake Bordelon, Samuel J. Gershman, Cengiz Pehlevan
Grokking as Compression: A Nonlinear Complexity Perspective
Ziming Liu, Ziqian Zhong, Max Tegmark
A Generalization Bound of Deep Neural Networks for Dependent Data
Quan Huu Do, Binh T. Nguyen, Lam Si Tung Ho
Causal Reasoning through Two Layers of Cognition for Improving Generalization in Visual Question Answering
Trang Nguyen, Naoaki Okazaki
What do larger image classifiers memorise?
Michal Lukasik, Vaishnavh Nagarajan, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar