Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures, including CNNs and RNNs; and enhancing generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and training paradigms such as few-shot and meta-learning. These advances are crucial for building reliable, adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
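To make one of the techniques named above concrete, the sketch below shows a minimal form of consistency regularization in PyTorch: the model is penalized when its predictions on two augmented views of the same unlabeled input disagree. The function names, augmentation setup, and weighting coefficient `lam` are illustrative assumptions, not drawn from any of the papers listed below.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_weak, x_strong):
    """Penalize disagreement between predictions on a weakly and a
    strongly augmented view of the same inputs (illustrative sketch)."""
    with torch.no_grad():
        # Predictions on the weakly augmented view serve as soft targets.
        targets = F.softmax(model(x_weak), dim=-1)
    log_probs = F.log_softmax(model(x_strong), dim=-1)
    # KL divergence between the two predictive distributions.
    return F.kl_div(log_probs, targets, reduction="batchmean")

def train_step(model, optimizer, x_lab, y_lab, x_weak, x_strong, lam=1.0):
    """Hypothetical training step combining a supervised loss on labeled
    data with the consistency penalty on unlabeled data."""
    optimizer.zero_grad()
    sup = F.cross_entropy(model(x_lab), y_lab)
    cons = consistency_loss(model, x_weak, x_strong)
    loss = sup + lam * cons  # lam weights the consistency term
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice, the strength of the augmentations and the weight `lam` control how aggressively the model is pushed toward augmentation-invariant predictions, which is one common route to better generalization on unseen data.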
Papers
SAR: Generalization of Physiological Agility and Dexterity via Synergistic Action Representation
Cameron Berg, Vittorio Caggiano, Vikash Kumar
Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation
Annie Xie, Lisa Lee, Ted Xiao, Chelsea Finn
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
Ming Yang, Xiyuan Wei, Tianbao Yang, Yiming Ying
GIT: Detecting Uncertainty, Out-Of-Distribution and Adversarial Samples using Gradients and Invariance Transformations
Julia Lust, Alexandru P. Condurache
Jailbroken: How Does LLM Safety Training Fail?
Alexander Wei, Nika Haghtalab, Jacob Steinhardt
FAM: Relative Flatness Aware Minimization
Linara Adilova, Amr Abourayya, Jianning Li, Amin Dada, Henning Petzka, Jan Egger, Jens Kleesiek, Michael Kamp