Two-Layer ReLU Networks

Two-layer ReLU networks are a foundational model in deep learning and an active testbed for understanding optimization landscapes and generalization. Current work analyzes the implicit bias of gradient-based optimization, examines how initialization and learning rate shape the solutions these methods reach, and develops convex relaxations of the training problem that improve efficiency and come with theoretical guarantees. These investigations sharpen our understanding of neural network training dynamics and support the design of more robust, efficient models with provable properties.
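
For a concrete reference point, a two-layer ReLU network computes f(x) = aᵀ ReLU(Wx + b). The sketch below shows this model and one gradient-descent step on a squared loss in plain numpy; the dimensions, initialization scales, and learning rate are illustrative assumptions, not taken from any particular paper.

```python
import numpy as np

# Minimal two-layer ReLU network: f(x) = a^T ReLU(W x + b)
# Shapes: x in R^d, W in R^{m x d}, b in R^m, a in R^m (m hidden units).
rng = np.random.default_rng(0)
d, m = 5, 16                                      # input dim, hidden width (illustrative)
W = rng.normal(scale=1 / np.sqrt(d), size=(m, d))  # first-layer weights
b = np.zeros(m)                                    # first-layer biases
a = rng.normal(scale=1 / np.sqrt(m), size=m)       # second-layer weights

def forward(x):
    """Forward pass for a single input x; returns a scalar output."""
    h = np.maximum(W @ x + b, 0.0)  # hidden activations after ReLU
    return a @ h

def squared_loss_grads(x, y):
    """Gradients of 0.5 * (f(x) - y)^2 with respect to W, b, a."""
    pre = W @ x + b
    h = np.maximum(pre, 0.0)
    err = a @ h - y                    # residual f(x) - y
    mask = (pre > 0).astype(float)     # ReLU subgradient (0 at the kink)
    grad_a = err * h
    grad_pre = err * a * mask          # backprop through the ReLU
    grad_W = np.outer(grad_pre, x)
    grad_b = grad_pre
    return grad_W, grad_b, grad_a

# One gradient-descent step on a single example (learning rate is arbitrary).
x, y = rng.normal(size=d), 1.0
gW, gb, ga = squared_loss_grads(x, y)
lr = 0.1
W -= lr * gW
b -= lr * gb
a -= lr * ga
print(forward(x))
```

Much of the literature summarized above studies exactly this parameterization: which of the many interpolating solutions gradient descent selects (implicit bias), how that selection depends on the initialization scale and step size, and how the non-convex training objective can be relaxed or reformulated as a convex program.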

Papers