Strong Generalization
Strong generalization, the ability of machine learning models to perform well on data that differs from what they were trained on, rather than merely on held-out samples from the same distribution, is a central objective in current research. Active areas of investigation include improving the robustness of self-supervised learning, understanding the optimization dynamics of transformers and other architectures (including CNNs and RNNs), and developing methods to enhance generalization through data augmentation, regularization techniques (e.g., logical regularization, consistency regularization), and improved training strategies (e.g., few-shot learning, meta-learning). These advances are crucial for building reliable and adaptable AI systems across diverse applications, from image classification and natural language processing to healthcare and robotics.
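To make one of the named techniques concrete, the sketch below shows consistency regularization in its simplest form: penalizing a model for disagreeing with itself across two random augmentations of the same unlabeled input. This is a minimal PyTorch sketch under assumed choices; the toy noise augmentation, the two-layer model, and the weighting `lam` are illustrative placeholders, not drawn from any of the papers listed below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of consistency regularization (assumed setup, for
# illustration only): the model should produce similar predictions
# on two randomly augmented views of the same unlabeled input.

def augment(x: torch.Tensor) -> torch.Tensor:
    """Toy augmentation: additive Gaussian noise (stand-in for crops/flips)."""
    return x + 0.1 * torch.randn_like(x)

def consistency_loss(model: nn.Module, x_unlabeled: torch.Tensor) -> torch.Tensor:
    """KL divergence between predictions on two augmented views."""
    p = F.log_softmax(model(augment(x_unlabeled)), dim=-1)
    with torch.no_grad():  # treat the second view as a fixed target
        q = F.softmax(model(augment(x_unlabeled)), dim=-1)
    return F.kl_div(p, q, reduction="batchmean")

# Usage: combine with an ordinary supervised loss on labeled data.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
x_lab, y_lab = torch.randn(16, 32), torch.randint(0, 10, (16,))
x_unlab = torch.randn(64, 32)

lam = 1.0  # consistency weight (assumed hyperparameter)
loss = F.cross_entropy(model(x_lab), y_lab) + lam * consistency_loss(model, x_unlab)
loss.backward()
```

The design intuition is that the unlabeled term acts as a smoothness prior: predictions should be stable under perturbations that preserve the input's identity, which is one route to generalizing beyond the labeled training set.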
Papers
Towards Understanding the Generalization of Medical Text-to-SQL Models and Datasets
Richard Tarbell, Kim-Kwang Raymond Choo, Glenn Dietrich, Anthony Rios
Revisiting DeepFool: generalization and improvement
Alireza Abdollahpourrostam, Mahed Abroshan, Seyed-Mohsen Moosavi-Dezfooli
Nowhere coexpanding functions
Andrew Cook, Andy Hammerlindl, Warwick Tucker
On the Expressiveness and Generalization of Hypergraph Neural Networks
Zhezheng Luo, Jiayuan Mao, Joshua B. Tenenbaum, Leslie Pack Kaelbling
Inversion dynamics of class manifolds in deep learning reveals tradeoffs underlying generalisation
Simone Ciceri, Lorenzo Cassani, Matteo Osella, Pietro Rotondo, Filippo Valle, Marco Gherardi