Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning and understanding the optimization dynamics of transformers and other architectures, including CNNs and RNNs. Researchers are also developing methods that enhance generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and improved training strategies (e.g., few-shot learning, meta-learning). These advances are crucial for building reliable and adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
Papers
The Missing Margin: How Sample Corruption Affects Distance to the Boundary in ANNs
Marthinus W. Theunissen, Coenraad Mouton, Marelie H. Davel
Do Neural Networks Generalize from Self-Averaging Sub-classifiers in the Same Way As Adaptive Boosting?
Michael Sun, Peter Chatain
simpleKT: A Simple But Tough-to-Beat Baseline for Knowledge Tracing
Zitao Liu, Qiongqiong Liu, Jiahao Chen, Shuyan Huang, Weiqi Luo
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li, Meng Wang, Sijia Liu, Pin-Yu Chen
Theory on Forgetting and Generalization of Continual Learning
Sen Lin, Peizhong Ju, Yingbin Liang, Ness Shroff
Koopman-based generalization bound: New aspect for full-rank weights
Yuka Hashimoto, Sho Sonoda, Isao Ishikawa, Atsushi Nitanda, Taiji Suzuki
Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion
Haotian Ju, Dongyue Li, Aneesh Sharma, Hongyang R. Zhang
Feature Likelihood Divergence: Evaluating the Generalization of Generative Models Using Samples
Marco Jiralerspong, Avishek Joey Bose, Ian Gemp, Chongli Qin, Yoram Bachrach, Gauthier Gidel
The SSL Interplay: Augmentations, Inductive Bias, and Generalization
Vivien Cabannes, Bobak T. Kiani, Randall Balestriero, Yann LeCun, Alberto Bietti
Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning
Thomas Carta, Clément Romac, Thomas Wolf, Sylvain Lamprier, Olivier Sigaud, Pierre-Yves Oudeyer
Domain Adaptation via Rebalanced Sub-domain Alignment
Yiling Liu, Juncheng Dong, Ziyang Jiang, Ahmed Aloui, Keyu Li, Hunter Klein, Vahid Tarokh, David Carlson
Generalizing to Unseen Elements: A Survey on Knowledge Extrapolation for Knowledge Graphs
Mingyang Chen, Wen Zhang, Yuxia Geng, Zezhong Xu, Jeff Z. Pan, Huajun Chen