Deep Learning Theory
Deep learning theory seeks to understand the mathematical principles behind the empirical success of deep neural networks. Current research focuses on explaining generalization, particularly why overparameterized models generalize well despite having many more parameters than training examples; on characterizing the inductive biases of different architectures, such as MLPs, Transformers, and Graph Neural Networks; and on analyzing training dynamics through lenses such as the neural tangent kernel (NTK). These theoretical advances inform the design, training, and interpretability of deep learning models, supporting more robust and reliable applications across scientific and engineering domains.
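As a concrete illustration of the NTK lens, the sketch below computes the empirical NTK of a small MLP in JAX. The network shape, initialization scheme, and all names (`mlp`, `empirical_ntk`, and so on) are illustrative assumptions, not drawn from any specific paper; the empirical NTK itself is the standard quantity Θ(x, x′) = J(x) J(x′)ᵀ, where J is the Jacobian of the network output with respect to the flattened parameter vector.

```python
# Minimal sketch: empirical neural tangent kernel of a tiny MLP in JAX.
# Theta[i, j] = <df(x1_i)/dtheta, df(x2_j)/dtheta>, with theta the
# flattened parameters. All names here are illustrative, not canonical.
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree


def init_params(key, sizes=(2, 16, 1)):
    """Random weights and zero biases for a small fully connected net."""
    params = []
    for din, dout in zip(sizes[:-1], sizes[1:]):
        key, wkey = jax.random.split(key)
        params.append((jax.random.normal(wkey, (din, dout)) / jnp.sqrt(din),
                       jnp.zeros(dout)))
    return params


def mlp(params, x):
    """Scalar-output MLP with tanh hidden activations."""
    for w, b in params[:-1]:
        x = jnp.tanh(x @ w + b)
    w, b = params[-1]
    return (x @ w + b).squeeze()


def empirical_ntk(params, x1, x2):
    """Kernel matrix of per-example parameter gradients: J(x1) @ J(x2).T."""
    def single_grad(x):
        # Gradient of the scalar output w.r.t. all parameters, flattened.
        grads = jax.grad(mlp)(params, x)
        flat, _ = ravel_pytree(grads)
        return flat
    j1 = jax.vmap(single_grad)(x1)  # shape (n1, num_params)
    j2 = jax.vmap(single_grad)(x2)  # shape (n2, num_params)
    return j1 @ j2.T                # shape (n1, n2)


key = jax.random.PRNGKey(0)
pkey, xkey = jax.random.split(key)
params = init_params(pkey)
x = jax.random.normal(xkey, (4, 2))
theta = empirical_ntk(params, x, x)
print(theta.shape)  # (4, 4); symmetric and positive semi-definite
```

The theoretical relevance of this quantity is that, in the infinite-width limit with suitable initialization, the NTK stays essentially fixed during gradient-descent training, so the network's function evolves like kernel regression with this kernel; finite-width empirical kernels like the one above are what such analyses track.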