ReLU Layer
ReLU (Rectified Linear Unit) layers apply the element-wise function f(x) = max(0, x) and are a fundamental component of many neural networks, introducing the non-linearity that allows them to learn complex patterns. Current research focuses on ReLU's theoretical properties, including its impact on network injectivity, expressivity, and approximation capability, often in the context of specific architectures such as convolutional neural networks and transformers. This work aims to improve training efficiency, enhance model interpretability, and address challenges such as the "dying ReLU" problem and computational cost across applications ranging from image classification to reinforcement learning. The findings deepen our understanding of neural network behavior and inform the design of more efficient and effective deep learning models.
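To make the mechanics concrete, the sketch below shows a minimal ReLU layer with a forward and backward pass in NumPy. It is an illustrative example only, not code from the papers listed below; the class name ReLULayer and its structure are assumptions. The backward pass also shows why "dying ReLU" can occur: units whose inputs are never positive receive zero gradient and stop updating.

```python
import numpy as np

class ReLULayer:
    """Element-wise ReLU: f(x) = max(0, x).

    Minimal illustrative sketch; names and structure are assumptions,
    not taken from any of the papers listed below.
    """

    def forward(self, x: np.ndarray) -> np.ndarray:
        # Cache the mask of active units for use in the backward pass.
        self.mask = x > 0
        return np.where(self.mask, x, 0.0)

    def backward(self, grad_out: np.ndarray) -> np.ndarray:
        # Gradients flow only through units whose input was positive.
        # Units that are never active get zero gradient, which is the
        # mechanism behind the "dying ReLU" problem mentioned above.
        return grad_out * self.mask


# Usage example
layer = ReLULayer()
x = np.array([[-1.5, 0.0, 2.0],
              [ 3.0, -0.5, 1.0]])
y = layer.forward(x)                   # [[0. 0. 2.], [3. 0. 1.]]
dx = layer.backward(np.ones_like(x))   # [[0. 0. 1.], [1. 0. 1.]]
```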
Papers
Network Degeneracy as an Indicator of Training Performance: Comparing Finite and Infinite Width Angle Predictions
Cameron Jakub, Mihai Nica
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
Andrew Jesson, Chris Lu, Gunshi Gupta, Nicolas Beltran-Velez, Angelos Filos, Jakob Nicolaus Foerster, Yarin Gal