Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures such as CNNs and RNNs; and enhancing generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and improved training strategies (e.g., few-shot learning, meta-learning). These advances are crucial for building reliable, adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
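To make one of the techniques named above concrete, here is a minimal, illustrative sketch of consistency regularization: the model is penalized for predicting differently on an input and a lightly perturbed copy of it. All names and the toy "model" below are assumptions for illustration, not drawn from any paper in this list.

```python
# Toy sketch of consistency regularization. The "model" is a linear
# scorer squashed through a sigmoid; real uses would apply the same idea
# to a neural network and a stronger augmentation pipeline.
import math
import random

def predict(weights, x):
    """Toy model: sigmoid of a dot product."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def augment(x, noise=0.05, rng=random):
    """Small random perturbation the model should be invariant to."""
    return [xi + rng.uniform(-noise, noise) for xi in x]

def consistency_loss(weights, x, rng=random):
    """Squared gap between predictions on x and an augmented copy of x."""
    return (predict(weights, x) - predict(weights, augment(x, rng=rng))) ** 2

# In training, this term is added to the usual supervised loss, scaled
# by a hyperparameter lam:  loss = supervised_loss + lam * consistency_loss
rng = random.Random(0)
w = [0.5, -0.3, 0.8]
x = [1.0, 2.0, -1.0]
print(consistency_loss(w, x, rng=rng))
```

The loss is zero exactly when the model's output is unchanged by the perturbation, so minimizing it pushes the model toward augmentation-invariant predictions.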
Papers
Hierarchical Prompt Decision Transformer: Improving Few-Shot Policy Generalization with Global and Adaptive Guidance
Zhe Wang, Haozhu Wang, Yanjun Qi
Learning on Less: Constraining Pre-trained Model Learning for Generalizable Diffusion-Generated Image Detection
Yingjian Chen, Lei Zhang, Yakun Niu, Lei Tan, Pei Chen
Scalable Out-of-distribution Robustness in the Presence of Unobserved Confounders
Parjanya Prashant, Seyedeh Baharan Khatami, Bruno Ribeiro, Babak Salimi
Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning
Severin Bochem, Eduardo Gonzalez-Sanchez, Yves Bicker, Gabriele Fadini
Learned Random Label Predictions as a Neural Network Complexity Metric
Marlon Becker, Benjamin Risse
Differential learning kinetics govern the transition from memorization to generalization during in-context learning
Alex Nguyen, Gautam Reddy
IKUN: Initialization to Keep snn training and generalization great with sUrrogate-stable variaNce
Da Chang, Deliang Wang, Xiao Yang
Using different sources of ground truths and transfer learning to improve the generalization of photometric redshift estimation
Jonathan Soriano, Srinath Saikrishnan, Vikram Seenivasan, Bernie Boscoe, Jack Singal, Tuan Do
Verbalized Representation Learning for Interpretable Few-Shot Generalization
Cheng-Fu Yang, Da Yin, Wenbo Hu, Nanyun Peng, Bolei Zhou, Kai-Wei Chang
From memorization to generalization: a theoretical framework for diffusion-based generative models
Indranil Halder
On the Generalization of Handwritten Text Recognition Models
Carlos Garrido-Munoz, Jorge Calvo-Zaragoza
An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models
Yunzhe Hu, Difan Zou, Dong Xu
Aligning Generalisation Between Humans and Machines
Filip Ilievski, Barbara Hammer, Frank van Harmelen, Benjamin Paassen, Sascha Saralajew, Ute Schmid, Michael Biehl, Marianna Bolognesi, Xin Luna Dong, Kiril Gashteovski, Pascal Hitzler, Giuseppe Marra, Pasquale Minervini, Martin Mundt, Axel-Cyrille Ngonga Ngomo, Alessandro Oltramari, Gabriella Pasi, Zeynep G. Saribatur, Luciano Serafini, John Shawe-Taylor, Vered Shwartz, Gabriella Skitalinskaya, Clemens Stachl, Gido M. van de Ven, Thomas Villmann
MUNBa: Machine Unlearning via Nash Bargaining
Jing Wu, Mehrtash Harandi
MLDGG: Meta-Learning for Domain Generalization on Graphs
Qin Tian, Chen Zhao, Minglai Shao, Wenjun Wang, Yujie Lin, Dong Li
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Laura Ruis, Maximilian Mozes, Juhan Bae, Siddhartha Rao Kamalakara, Dwarak Talupuru, Acyr Locatelli, Robert Kirk, Tim Rocktäschel, Edward Grefenstette, Max Bartolo