Generalization Theory
Generalization theory in machine learning seeks to explain why and when models trained on a finite dataset perform well on unseen data. Current research focuses on deriving tighter generalization bounds via information-theoretic approaches, analyzing the training dynamics of specific algorithms such as stochastic gradient descent (SGD) and direct preference optimization (DPO), and examining how model architecture affects generalization, particularly for large language models and networks in the kernel regime. Together, these lines of work aim to put the empirical success of modern machine learning on a firmer theoretical footing, informing model design and yielding more reliable performance guarantees across applications.
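For concreteness, a representative result of the information-theoretic approach is the bound of Xu and Raginsky (2017); the statement below is the standard textbook form rather than a result from any particular paper surveyed here. If the loss \ell(w, Z) is \sigma-sub-Gaussian under the data distribution \mu for every w, and a learning algorithm maps an i.i.d. sample S = (Z_1, \dots, Z_n) to weights W, then the expected generalization gap satisfies

\left| \mathbb{E}\left[ L_\mu(W) - L_S(W) \right] \right| \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(S; W)},

where L_\mu(W) is the population risk, L_S(W) the empirical risk, and I(S; W) the mutual information between the training sample and the learned weights. The bound formalizes the intuition that an algorithm which extracts fewer bits of information about its training set generalizes better.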