Two-Layer ReLU Networks
Two-layer ReLU networks, a foundational model in deep learning, are the subject of intense research aimed at understanding their optimization landscape and generalization capabilities. Current work focuses on analyzing the implicit bias of gradient-based optimization algorithms, exploring the roles of initialization and learning rate, and developing convex relaxations that improve training efficiency and provide theoretical guarantees. These investigations are crucial for advancing our understanding of neural network training dynamics and for developing more robust and efficient machine learning models with provable properties.
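The paragraph above treats the two-layer ReLU network as the basic object of study. The sketch below spells out concretely what that model and its gradient-descent training look like: a minimal NumPy illustration on synthetic data, with illustrative hyperparameters (width, learning rate, step count) chosen for the example rather than taken from any of the listed papers.

```python
import numpy as np

# Minimal sketch of a two-layer ReLU network f(x) = a^T relu(W x),
# trained with plain gradient descent on a toy regression task.
# All hyperparameters here are illustrative, not from any specific paper.

rng = np.random.default_rng(0)

# Toy data: inputs in R^d, targets from a smooth nonlinear function.
n, d, m = 200, 5, 64                 # samples, input dim, hidden width
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

# Small random initialization; the scale of W and a is one of the knobs
# that influences the implicit bias of gradient descent discussed above.
W = rng.normal(size=(m, d)) * 0.1    # hidden-layer weights
a = rng.normal(size=m) * 0.1         # output-layer weights

def forward(X, W, a):
    """Return pre-activations, hidden activations, and predictions."""
    Z = X @ W.T                      # (n, m) pre-activations
    H = np.maximum(Z, 0.0)           # ReLU
    return Z, H, H @ a               # predictions, shape (n,)

lr, steps = 0.05, 2000               # illustrative learning rate and steps
for t in range(steps):
    Z, H, preds = forward(X, W, a)
    r = preds - y                    # residuals
    loss = 0.5 * np.mean(r ** 2)

    # Backpropagation for the squared loss, written out explicitly.
    grad_a = H.T @ r / n                          # gradient w.r.t. a, (m,)
    dZ = (r[:, None] * a[None, :]) * (Z > 0) / n  # gradient w.r.t. Z, (n, m)
    grad_W = dZ.T @ X                             # gradient w.r.t. W, (m, d)

    a -= lr * grad_a
    W -= lr * grad_W

    if t % 500 == 0:
        print(f"step {t:4d}  loss {loss:.4f}")
```

In this form it is easy to experiment with the questions mentioned above, for example by varying the initialization scale of W and a or the learning rate and observing how the solution reached by gradient descent changes.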
Papers
18 papers, dated from July 28, 2023 to November 11, 2024.