Catastrophic Overfitting

Catastrophic overfitting (CO) is a significant challenge in fast adversarial training (FAT) of neural networks, where a model abruptly loses robustness against multi-step attacks (such as PGD) even while its robustness to the single-step attacks used during training keeps improving. Current research focuses on understanding the underlying causes of CO, particularly for single-step adversarial training applied to convolutional neural networks (CNNs) and vision transformers (ViTs), and on developing mitigation techniques based on loss function modifications, regularization strategies, and adaptive training procedures. Addressing CO is crucial for building efficient yet robust machine learning models, with implications for the security and reliability of AI systems across applications.

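As a concrete illustration, the sketch below pairs a single-step FGSM training loop with a multi-step PGD evaluation, which is the usual way CO is detected: PGD robust accuracy collapses toward zero during training even though accuracy under the FGSM attack the model was trained on stays high. This is a generic, minimal sketch in PyTorch, not the method of any particular paper; `model`, `loader`, and the optimizer are assumed to be supplied by the reader, inputs are assumed to lie in [0, 1], and the hyperparameters (eps = 8/255, alpha, step count) are common but illustrative choices.

```python
import torch
import torch.nn.functional as F


def fgsm_perturb(model, x, y, eps):
    """Single-step FGSM perturbation, as used in fast adversarial training."""
    delta = torch.zeros_like(x, requires_grad=True)
    loss = F.cross_entropy(model(x + delta), y)
    grad = torch.autograd.grad(loss, delta)[0]
    return (eps * grad.sign()).detach()


def pgd_perturb(model, x, y, eps, alpha=2 / 255, steps=10):
    """Multi-step PGD perturbation; robustness to it collapses when CO occurs."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return delta.detach()


def fast_adv_train_epoch(model, loader, opt, eps=8 / 255, device="cpu"):
    """One epoch of single-step (FGSM-based) fast adversarial training."""
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        delta = fgsm_perturb(model, x, y, eps)
        opt.zero_grad()
        # Inputs assumed to be normalized to [0, 1].
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        loss.backward()
        opt.step()


def robust_accuracy(model, loader, attack_fn, eps=8 / 255, device="cpu"):
    """Accuracy under a given attack. A sudden drop of PGD accuracy toward
    zero while FGSM accuracy stays high is the usual signature of CO."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        delta = attack_fn(model, x, y, eps)  # attack needs gradients w.r.t. delta
        with torch.no_grad():
            pred = model((x + delta).clamp(0, 1)).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total


# Hypothetical usage: monitor both metrics each epoch to catch CO early.
# for epoch in range(epochs):
#     fast_adv_train_epoch(model, train_loader, opt)
#     acc_fgsm = robust_accuracy(model, val_loader, fgsm_perturb)
#     acc_pgd = robust_accuracy(model, val_loader, pgd_perturb)
```

Monitoring both attack types per epoch is the standard diagnostic: mitigation strategies in the literature (loss modifications, regularizers, adaptive step sizes) are typically judged by whether the PGD curve stays close to the FGSM curve rather than collapsing.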
Papers