Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures, including CNNs and RNNs; and enhancing generalization through data augmentation, regularization techniques (e.g., consistency regularization), and training paradigms such as few-shot learning and meta-learning. These advances are crucial for building reliable, adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
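To make one of the regularization ideas above concrete, here is a minimal sketch of consistency regularization: penalizing a model for changing its predictions when the input is perturbed by an augmentation. The toy linear classifier, the Gaussian-noise augmentation, and all names below are illustrative assumptions, not the method of any paper listed here.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict(weights, x):
    """Toy linear classifier: class probabilities from logits x @ weights."""
    return softmax(x @ weights)

def consistency_loss(weights, x, noise_scale=0.1):
    """Mean squared difference between predictions on clean inputs and on
    noise-augmented inputs; a simple consistency-regularization penalty
    that would be added to the usual supervised loss during training."""
    x_aug = x + noise_scale * rng.standard_normal(x.shape)
    p_clean = predict(weights, x)
    p_aug = predict(weights, x_aug)
    return float(np.mean((p_clean - p_aug) ** 2))

weights = rng.standard_normal((4, 3))  # 4 features, 3 classes
x = rng.standard_normal((8, 4))        # batch of 8 inputs
loss = consistency_loss(weights, x)
```

In practice the augmentation would be task-appropriate (crops or color jitter for images, token dropout for text), and the penalty is weighted against the supervised loss.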
Papers
On Influence Functions, Classification Influence, Relative Influence, Memorization and Generalization
Michael Kounavis, Ousmane Dia, Ilqar Ramazanli
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois, Damien Scieur, Jean-Michel Morel, Pablo Arias, Thomas Eboli
PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion
Yige Yuan, Bingbing Xu, Bo Lin, Liang Hou, Fei Sun, Huawei Shen, Xueqi Cheng
Sharpness-Aware Minimization Revisited: Weighted Sharpness as a Regularization Term
Yun Yue, Jiadi Jiang, Zhiling Ye, Ning Gao, Yongchao Liu, Ke Zhang
Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting
Hilaf Hasson, Danielle C. Maddix, Yuyang Wang, Gaurav Gupta, Youngsuk Park
Promoting Generalization in Cross-Dataset Remote Photoplethysmography
Nathan Vance, Jeremy Speth, Benjamin Sporrer, Patrick Flynn
Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4
Kellin Pelrine, Anne Imouza, Camille Thibault, Meilina Reksoprodjo, Caleb Gupta, Joel Christoph, Jean-François Godbout, Reihaneh Rabbany
On the Generalization of Diffusion Model
Mingyang Yi, Jiacheng Sun, Zhenguo Li
Modeling rapid language learning by distilling Bayesian priors into artificial neural networks
R. Thomas McCoy, Thomas L. Griffiths
On progressive sharpening, flat minima and generalisation
Lachlan Ewen MacDonald, Jack Valmadre, Simon Lucey
Think Before You Act: Decision Transformers with Working Memory
Jikun Kang, Romain Laroche, Xingdi Yuan, Adam Trischler, Xue Liu, Jie Fu
Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction
Yew Ken Chia, Hui Chen, Wei Han, Guizhen Chen, Sharifah Mahani Aljunied, Soujanya Poria, Lidong Bing
Federated Variational Inference: Towards Improved Personalization and Generalization
Elahe Vedadi, Joshua V. Dillon, Philip Andrew Mansfield, Karan Singhal, Arash Afkanpour, Warren Richard Morningstar
Towards understanding neural collapse in supervised contrastive learning with the information bottleneck method
Siwei Wang, Stephanie E Palmer
Generalizing to new geometries with Geometry-Aware Autoregressive Models (GAAMs) for fast calorimeter simulation
Junze Liu, Aishik Ghosh, Dylan Smith, Pierre Baldi, Daniel Whiteson