Strong Generalization
Strong generalization, the ability of machine learning models to perform well on unseen data, is a central objective of current research. Active directions include improving the robustness of self-supervised learning, understanding the optimization dynamics of transformers and other architectures (including CNNs and RNNs), and enhancing generalization through data augmentation, regularization (e.g., logical regularization, consistency regularization), and training strategies such as few-shot learning and meta-learning. These advances underpin reliable, adaptable AI systems across applications ranging from image classification and natural language processing to healthcare and robotics.
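To make one of the mentioned techniques concrete, below is a minimal NumPy sketch of consistency regularization: the model is encouraged to produce the same predictions for an input and a lightly augmented view of it. The toy linear model, the Gaussian-noise augmentation, and all names here are illustrative assumptions, not the method of any listed paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, w):
    # Toy linear classifier: softmax over logits x @ w.
    logits = x @ w
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def augment(x, noise_scale=0.05):
    # Hypothetical augmentation: small Gaussian perturbation of the input.
    return x + noise_scale * rng.normal(size=x.shape)

def consistency_loss(x, w):
    # Penalize divergence between predictions on a clean input and on an
    # augmented view (mean squared difference of class probabilities).
    p_clean = model(x, w)
    p_aug = model(augment(x), w)
    return float(np.mean((p_clean - p_aug) ** 2))

x = rng.normal(size=(8, 4))   # batch of 8 inputs with 4 features
w = rng.normal(size=(4, 3))   # weights for 3 classes
loss = consistency_loss(x, w)
```

In practice this term is added (with a weighting coefficient) to the usual supervised loss, so unlabeled data can also contribute to training.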
Papers
Deep Companion Learning: Enhancing Generalization Through Historical Consistency
Ruizhao Zhu, Venkatesh Saligrama
Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation
Jiabo Ma, Zhengrui Guo, Fengtao Zhou, Yihui Wang, Yingxue Xu, Yu Cai, Zhengjie Zhu, Cheng Jin, Yi Lin, Xinrui Jiang, Anjia Han, Li Liang, Ronald Cheong Kin Chan, Jiguang Wang, Kwang-Ting Cheng, Hao Chen
Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks
Xingcheng Xu, Zibo Zhao, Haipeng Zhang, Yanqing Yang
DualFed: Enjoying both Generalization and Personalization in Federated Learning via Hierachical Representations
Guogang Zhu, Xuefeng Liu, Jianwei Niu, Shaojie Tang, Xinghao Wu, Jiayuan Zhang
Fractional signature: a generalisation of the signature inspired by fractional calculus
José Manuel Corcuera, Rubén Jiménez
Gradient-based inference of abstract task representations for generalization in neural networks
Ali Hummos, Felipe del Río, Brabeeba Mien Wang, Julio Hurtado, Cristian B. Calderon, Guangyu Robert Yang
Baba Is AI: Break the Rules to Beat the Benchmark
Nathan Cloos, Meagan Jens, Michelangelo Naim, Yen-Ling Kuo, Ignacio Cases, Andrei Barbu, Christopher J. Cueva
Instance Selection for Dynamic Algorithm Configuration with Reinforcement Learning: Improving Generalization
Carolin Benjamins, Gjorgjina Cenikj, Ana Nikolikj, Aditya Mohan, Tome Eftimov, Marius Lindauer
CycleMix: Mixing Source Domains for Domain Generalization in Style-Dependent Data
Aristotelis Ballas, Christos Diou