Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures such as CNNs and RNNs; and enhancing generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and training strategies such as few-shot learning and meta-learning (a worked sketch of one such technique follows below). These advances are crucial for building reliable and adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
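To make one of the techniques above concrete, here is a minimal sketch of consistency regularization, assuming a PyTorch setup; the model, the noise-based augmentation, and the loss weight are illustrative placeholders, not taken from any of the papers listed below.

    # Minimal consistency-regularization sketch (assumed PyTorch setup).
    import torch
    import torch.nn.functional as F

    def consistency_loss(model, x, augment):
        """Penalize prediction drift under a label-preserving augmentation."""
        p_clean = F.softmax(model(x), dim=-1).detach()         # target: clean view
        log_p_aug = F.log_softmax(model(augment(x)), dim=-1)   # augmented view
        # KL divergence pulls the augmented prediction toward the clean one.
        return F.kl_div(log_p_aug, p_clean, reduction="batchmean")

    # Toy usage: add the term to the task loss with a small weight.
    model = torch.nn.Linear(8, 4)
    x = torch.randn(16, 8)
    augment = lambda t: t + 0.1 * torch.randn_like(t)  # hypothetical noise augmentation
    loss = consistency_loss(model, x, augment)         # combine as: task_loss + 0.1 * loss
    print(loss.item())

Detaching the clean-view probabilities treats them as a fixed target, so the gradient only pushes the augmented prediction toward the clean one; the weight on this term is a tunable hyperparameter.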
Papers
In-context learning and Occam's razor
Eric Elmoznino, Tom Marty, Tejas Kasetty, Leo Gagnon, Sarthak Mittal, Mahan Fathi, Dhanya Sridhar, Guillaume Lajoie
Generalization for Least Squares Regression With Simple Spiked Covariances
Jiping Li, Rishi Sonthalia
Adversarial Testing as a Tool for Interpretability: Length-based Overfitting of Elementary Functions in Transformers
Patrik Zavoral, Dušan Variš, Ondřej Bojar
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal, Haruki Shirakami, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan
FOOGD: Federated Collaboration for Both Out-of-distribution Generalization and Detection
Xinting Liao, Weiming Liu, Pengyang Zhou, Fengyuan Yu, Jiahe Xu, Jun Wang, Wenjie Wang, Chaochao Chen, Xiaolin Zheng
Cross-Dataset Generalization in Deep Learning
Xuyu Zhang, Haofan Huang, Dawei Zhang, Songlin Zhuang, Shensheng Han, Puxiang Lai, Honglin Liu
Neural networks that overcome classic challenges through practice
Kazuki Irie, Brenden M. Lake
MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
Minghao Zhu, Zhengpu Wang, Mengxian Hu, Ronghao Dang, Xiao Lin, Xun Zhou, Chengju Liu, Qijun Chen
The Implicit Bias of Structured State Space Models Can Be Poisoned With Clean Labels
Yonatan Slutzky, Yotam Alexander, Noam Razin, Nadav Cohen
Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late in Training
Zhanpeng Zhou, Mingze Wang, Yuchen Mao, Bingrui Li, Junchi Yan
Feedback Favors the Generalization of Neural ODEs
Jindou Jia, Zihan Yang, Meng Wang, Kexin Guo, Jianfei Yang, Xiang Yu, Lei Guo
LOBG: Less Overfitting for Better Generalization in Vision-Language Model
Chenhao Ding, Xinyuan Gao, Songlin Dong, Yuhang He, Qiang Wang, Alex Kot, Yihong Gong
Towards Bridging Generalization and Expressivity of Graph Neural Networks
Shouheng Li, Floris Geerts, Dongwoo Kim, Qing Wang
Low-Dimension-to-High-Dimension Generalization And Its Implications for Length Generalization
Yang Chen, Yitao Liang, Zhouchen Lin
Score Neural Operator: A Generative Model for Learning and Generalizing Across Multiple Probability Distributions
Xinyu Liao, Aoyang Qin, Jacob Seidman, Junqi Wang, Wei Wang, Paris Perdikaris
Deeper Insights into Deep Graph Convolutional Networks: Stability and Generalization
Guangrui Yang, Ming Li, Han Feng, Xiaosheng Zhuang