Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective in current research. Active areas of investigation include improving the robustness of self-supervised learning, understanding the optimization dynamics of transformers and other architectures (including CNNs and RNNs), and developing methods to enhance generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and improved training strategies (e.g., few-shot learning, meta-learning). These advancements are crucial for building reliable and adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
Papers
Scalable Out-of-distribution Robustness in the Presence of Unobserved Confounders
Parjanya Prashant, Seyedeh Baharan Khatami, Bruno Ribeiro, Babak Salimi
Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning
Severin Bochem, Eduardo Gonzalez-Sanchez, Yves Bicker, Gabriele Fadini
Learned Random Label Predictions as a Neural Network Complexity Metric
Marlon Becker, Benjamin Risse
Differential learning kinetics govern the transition from memorization to generalization during in-context learning
Alex Nguyen, Gautam Reddy
IKUN: Initialization to Keep snn training and generalization great with sUrrogate-stable variaNce
Da Chang, Deliang Wang, Xiao Yang
Using different sources of ground truths and transfer learning to improve the generalization of photometric redshift estimation
Jonathan Soriano, Srinath Saikrishnan, Vikram Seenivasan, Bernie Boscoe, Jack Singal, Tuan Do
Verbalized Representation Learning for Interpretable Few-Shot Generalization
Cheng-Fu Yang, Da Yin, Wenbo Hu, Nanyun Peng, Bolei Zhou, Kai-Wei Chang
MLDGG: Meta-Learning for Domain Generalization on Graphs
Qin Tian, Chen Zhao, Minglai Shao, Wenjun Wang, Yujie Lin, Dong Li
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Laura Ruis, Maximilian Mozes, Juhan Bae, Siddhartha Rao Kamalakara, Dwarak Talupuru, Acyr Locatelli, Robert Kirk, Tim Rocktäschel+2