Deep Learning Theory
Deep learning theory seeks to understand the mathematical principles behind the empirical success of deep neural networks. Current research focuses on three questions: why overparameterized models generalize, what inductive biases particular architectures (such as MLPs, Transformers, and Graph Neural Networks) introduce, and how training dynamics unfold, analyzed through tools such as the neural tangent kernel. Results in these areas can inform the design, training, and interpretability of deep learning models, with the aim of more robust and reliable applications across scientific and engineering domains.
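To make the neural tangent kernel mentioned above more concrete, here is a minimal sketch of the empirical kernel, defined as the inner product of parameter gradients of the network output at two inputs. The toy network (one hidden tanh layer, scalar input, output scaled by one over the square root of the width), the width, the seed, and all function names are illustrative assumptions and are not taken from any particular paper.

```python
import math
import random


def init_params(width, seed=0):
    """Draw standard-normal weights for f(x) = sum_j a_j * tanh(w_j * x) / sqrt(width)."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(width)]
    a = [rng.gauss(0.0, 1.0) for _ in range(width)]
    return w, a


def param_gradient(x, w, a):
    """Gradient of the scalar output f(x) with respect to all parameters (w first, then a)."""
    scale = 1.0 / math.sqrt(len(w))
    # d f / d w_j = a_j * (1 - tanh(w_j x)^2) * x / sqrt(width)
    grad_w = [a_j * (1.0 - math.tanh(w_j * x) ** 2) * x * scale for w_j, a_j in zip(w, a)]
    # d f / d a_j = tanh(w_j x) / sqrt(width)
    grad_a = [math.tanh(w_j * x) * scale for w_j in w]
    return grad_w + grad_a


def empirical_ntk(xs, w, a):
    """Kernel matrix K[i][j] = <grad f(x_i), grad f(x_j)> at the current parameters."""
    grads = [param_gradient(x, w, a) for x in xs]
    return [[sum(p * q for p, q in zip(g_i, g_j)) for g_j in grads] for g_i in grads]


# Hypothetical inputs and width, chosen only for illustration.
xs = [-1.0, 0.5, 2.0]
w, a = init_params(width=512)
K = empirical_ntk(xs, w, a)
for row in K:
    print(["%.3f" % v for v in row])
```

The resulting matrix is symmetric and positive semidefinite by construction, since it is a Gram matrix of gradient vectors. Theoretical analyses typically study how this kernel behaves as the width grows and how much it changes during training; the sketch only evaluates it at initialization.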