Swish Activation

Swish is a class of smooth, non-monotonic activation functions for neural networks, defined as f(x) = x · σ(βx), where σ is the logistic sigmoid and β is a fixed or learnable parameter; it aims to improve model performance and training efficiency over traditional alternatives such as ReLU. Current research focuses on enhancing Swish's properties through modifications such as incorporating a Tanh bias (the Swish-T variants) or adapting its behavior based on input context (Adaptive Swish, ASH). These variants are being explored across diverse deep learning tasks, including image classification, continuous control in robotics, and continual learning, where they have shown potential to improve both accuracy and training speed.
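For concreteness, below is a minimal NumPy sketch of the standard Swish function. The function and parameter names are illustrative; the tanh-bias helper is a simplified stand-in for the general idea behind the Swish-T family, not the exact parameterization used in those papers.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: sigma(x) = 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + np.exp(-x))

def swish(x, beta=1.0):
    # Standard Swish: f(x) = x * sigma(beta * x).
    # beta = 1 recovers SiLU; in trainable variants, beta is learned.
    return x * sigmoid(beta * x)

def swish_tanh_bias(x, beta=1.0, alpha=0.1):
    # Illustrative tanh-bias variant (NOT the exact Swish-T definition):
    # adds a bounded alpha * tanh(x) term to the Swish output.
    return x * sigmoid(beta * x) + alpha * np.tanh(x)

if __name__ == "__main__":
    xs = np.linspace(-3.0, 3.0, 7)
    # Swish is non-monotonic: it dips below zero for negative inputs
    # before approaching zero as x -> -inf.
    print(np.round(swish(xs), 3))
    print(np.round(swish_tanh_bias(xs), 3))
```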

Papers