Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective in current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures, including CNNs and RNNs; and enhancing generalization through data augmentation, regularization (e.g., logical reasoning regularization, consistency regularization), and improved training strategies (e.g., few-shot learning, meta-learning). These advances are crucial for building reliable, adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
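To make one of the techniques mentioned above concrete, consistency regularization penalizes a model for making different predictions on an input and on an augmented copy of it. The sketch below is a minimal toy illustration, not any specific paper's method: `model` and `augment` are hypothetical stand-ins for a real network and a real data-augmentation pipeline.

```python
import numpy as np

def consistency_loss(model, x, augment, rng):
    """Mean squared difference between predictions on x and on an
    augmented (perturbed) copy of x. A lower value means the model
    is more invariant to the augmentation."""
    x_aug = augment(x, rng)
    return float(np.mean((model(x) - model(x_aug)) ** 2))

# Toy stand-ins: a fixed linear "model" and Gaussian-noise "augmentation".
w = np.array([0.5, -1.0, 2.0])
model = lambda x: x @ w
augment = lambda x, rng: x + 0.01 * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 3))
loss = consistency_loss(model, x, augment, rng)
```

In practice this term is added (with a weighting coefficient) to the supervised training loss, encouraging predictions that are stable under label-preserving perturbations.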
Papers
GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes
Maximus Powers, Hua Wei, Umang Mavani, Harshitha Reddy Jonala, Ansh Tiwari
Dynamics of Concept Learning and Compositional Generalization
Yongyi Yang, Core Francisco Park, Ekdeep Singh Lubana, Maya Okawa, Wei Hu, Hidenori Tanaka
Generalization from Starvation: Hints of Universality in LLM Knowledge Graph Learning
David D. Baek, Yuxiao Li, Max Tegmark
A Generalization Bound for a Family of Implicit Networks
Samy Wu Fung, Benjamin Berkels
Continual Learning: Less Forgetting, More OOD Generalization via Adaptive Contrastive Replay
Hossein Rezaei, Mohammad Sabokrou
Emergent properties with repeated examples
François Charton, Julia Kempe
Tri-Level Navigator: LLM-Empowered Tri-Level Learning for Time Series OOD Generalization
Chengtao Jian, Kai Yang, Yang Jiao
Fill In The Gaps: Model Calibration and Generalization with Synthetic Data
Yang Ba, Michelle V. Mancenido, Rong Pan
Failure-Proof Non-Contrastive Self-Supervised Learning
Emanuele Sansone, Tim Lebailly, Tinne Tuytelaars
On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
Bingrui Li, Wei Huang, Andi Han, Zhanpeng Zhou, Taiji Suzuki, Jun Zhu, Jianfei Chen
Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization
Dylan Zhang, Justin Wang, Francois Charton
Provable Weak-to-Strong Generalization via Benign Overfitting
David X. Wu, Anant Sahai
EnsemW2S: Can an Ensemble of LLMs be Leveraged to Obtain a Stronger LLM?
Aakriti Agrawal, Mucong Ding, Zora Che, Chenghao Deng, Anirudh Satheesh, John Langford, Furong Huang
Interpret Your Decision: Logical Reasoning Regularization for Generalization in Visual Classification
Zhaorui Tan, Xi Yang, Qiufeng Wang, Anh Nguyen, Kaizhu Huang
Grokking at the Edge of Linear Separability
Alon Beck, Noam Levi, Yohai Bar-Sinai
DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion
Ke Sun, Shen Chen, Taiping Yao, Hong Liu, Xiaoshuai Sun, Shouhong Ding, Rongrong Ji