Strong Generalization
Strong generalization is the ability of a machine learning model to perform well on data it did not see during training, and it is a central objective of current research. Active areas of investigation include improving the robustness of self-supervised learning; understanding the optimization dynamics of transformers and other architectures, including CNNs and RNNs; and improving generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and training strategies (e.g., few-shot learning, meta-learning). This work matters for building reliable and adaptable AI systems in applications ranging from image classification and natural language processing to healthcare and robotics.
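To make "performing well on unseen data" concrete, the sketch below measures a generalization gap, taken here as training accuracy minus held-out accuracy. It is an illustration only and is not drawn from any of the papers listed here: the toy dataset, the 1-nearest-neighbor classifier, and the helper names (make_data, predict_1nn, accuracy) are all assumptions chosen to keep the example self-contained.

```python
import random


def make_data(n, rng, noise=0.2):
    """Sample n points x in [-1, 1], labelled 1 if x >= 0 else 0.

    A fraction `noise` of the labels is flipped, so a model that
    memorizes the training set also memorizes some wrong labels.
    """
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x >= 0 else 0
        if rng.random() < noise:
            y = 1 - y  # label noise (hypothetical setting for this sketch)
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def accuracy(train, data):
    """Fraction of points in `data` that 1-NN on `train` labels correctly."""
    correct = sum(predict_1nn(train, x) == y for x, y in data)
    return correct / len(data)


rng = random.Random(0)        # fixed seed so the run is repeatable
train = make_data(200, rng)   # data the model sees
test = make_data(200, rng)    # unseen data from the same distribution

train_acc = accuracy(train, train)
test_acc = accuracy(train, test)
gap = train_acc - test_acc    # generalization gap, in accuracy terms

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"gap:            {gap:.3f}")
```

Because every training point is its own nearest neighbor, the classifier scores perfectly on the training set, while its held-out accuracy is typically lower since it has also memorized the flipped labels. Augmentation, regularization, and the other methods named above can be read as attempts to shrink this kind of gap.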
Papers
Over-training with Mixup May Hurt Generalization
Zixuan Liu, Ziqiao Wang, Hongyu Guo, Yongyi Mao
The Double-Edged Sword of Implicit Bias: Generalization vs. Robustness in ReLU Networks
Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro
DSD$^2$: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu, Enzo Tartaglione
nnUNet RASPP for Retinal OCT Fluid Detection, Segmentation and Generalisation over Variations of Data Sources
Nchongmaje Ndipenoch, Alina Miron, Zidong Wang, Yongmin Li
Explaining Generalization Power of a DNN Using Interactive Concepts
Huilin Zhou, Hao Zhang, Huiqi Deng, Dongrui Liu, Wen Shen, Shih-Han Chan, Quanshi Zhang
Progressive Ensemble Distillation: Building Ensembles for Efficient Inference
Don Kurian Dennis, Abhishek Shetty, Anish Sevekari, Kazuhito Koishida, Virginia Smith
Towards Unbounded Machine Unlearning
Meghdad Kurmanji, Peter Triantafillou, Jamie Hayes, Eleni Triantafillou
On the Stability and Generalization of Triplet Learning
Jun Chen, Hong Chen, Xue Jiang, Bin Gu, Weifu Li, Tieliang Gong, Feng Zheng
AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models
Alexandra Chronopoulou, Matthew E. Peters, Alexander Fraser, Jesse Dodge
A Modern Look at the Relationship between Sharpness and Generalization
Maksym Andriushchenko, Francesco Croce, Maximilian Müller, Matthias Hein, Nicolas Flammarion