Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures, including CNNs and RNNs; and developing methods to enhance generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and improved training strategies (e.g., few-shot learning and meta-learning). These advances are crucial for building reliable and adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
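As a concrete illustration of one technique named above, the sketch below shows a minimal consistency-regularization loss in PyTorch: the model is penalized when its predictions on an augmented view of an input diverge from its predictions on the original. The `model` and `augment` arguments are placeholders for illustration (any classifier returning logits and any stochastic augmentation), not the method of any paper listed here.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x, augment):
    """Minimal consistency-regularization sketch.

    `model` and `augment` are assumptions for illustration: any network
    returning class logits and any stochastic augmentation (random crop,
    noise, etc.) fit this interface.
    """
    logits_clean = model(x)          # predictions on the original input
    logits_aug = model(augment(x))   # predictions on an augmented view

    # Treat the clean-view distribution as a fixed target (detached from
    # the graph) and penalize the KL divergence of the augmented view's
    # predictive distribution from it.
    p_clean = F.softmax(logits_clean.detach(), dim=-1)
    log_p_aug = F.log_softmax(logits_aug, dim=-1)
    return F.kl_div(log_p_aug, p_clean, reduction="batchmean")
```

In practice this term is added to the supervised loss with a weighting coefficient, encouraging predictions that are stable under input perturbations.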
Papers
Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture
Yubin Xiao, Di Wang, Xuan Wu, Yuesong Wu, Boyang Li, Wei Du, Liupu Wang, You Zhou
Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning
Donghu Kim, Hojoon Lee, Kyungmin Lee, Dongyoon Hwang, Jaegul Choo
Learning Divergence Fields for Shift-Robust Graph Representations
Qitian Wu, Fan Nie, Chenxiao Yang, Junchi Yan
Skill-aware Mutual Information Optimisation for Generalisation in Reinforcement Learning
Xuehui Yu, Mhairi Dunion, Xin Li, Stefano V. Albrecht
Confidence-aware Contrastive Learning for Selective Classification
Yu-Chang Wu, Shen-Huan Lyu, Haopu Shang, Xiangyu Wang, Chao Qian
Feature Contamination: Neural Networks Learn Uncorrelated Features and Fail to Generalize
Tianren Zhang, Chujie Zhao, Guanyu Chen, Yizhou Jiang, Feng Chen
Harder or Different? Understanding Generalization of Audio Deepfake Detection
Nicolas M. Müller, Nicholas Evans, Hemlata Tak, Philip Sperl, Konstantin Böttinger
Prediction-powered Generalization of Causal Inferences
Ilker Demirel, Ahmed Alaa, Anthony Philippakis, David Sontag
Representations as Language: An Information-Theoretic Framework for Interpretability
Henry Conklin, Kenny Smith
On the Limitations of Fractal Dimension as a Measure of Generalization
Charlie B. Tan, Inés García-Redondo, Qiquan Wang, Michael M. Bronstein, Anthea Monod
DNCs Require More Planning Steps
Yara Shamshoum, Nitzan Hodos, Yuval Sieradzki, Assaf Schuster
Verifying the Generalization of Deep Learning to Out-of-Distribution Domains
Guy Amir, Osher Maayan, Tom Zelazny, Guy Katz, Michael Schapira
What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding
Hongkang Li, Meng Wang, Tengfei Ma, Sijia Liu, Zaixi Zhang, Pin-Yu Chen
Improving Generalization in Aerial and Terrestrial Mobile Robots Control Through Delayed Policy Learning
Ricardo B. Grando, Raul Steinmetz, Victor A. Kich, Alisson H. Kolling, Pablo M. Furik, Junior C. de Jesus, Bruna V. Guterres, Daniel T. Gamarra, Rodrigo S. Guerra, Paulo L. J. Drews-Jr
DEFT: Efficient Fine-Tuning of Diffusion Models by Learning the Generalised $h$-transform
Alexander Denker, Francisco Vargas, Shreyas Padhy, Kieran Didi, Simon Mathis, Vincent Dutordoir, Riccardo Barbano, Emile Mathieu, Urszula Julia Komorowska, Pietro Lio
Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
Keyon Vafa, Ashesh Rambachan, Sendhil Mullainathan