Test-Time Defense
Test-time defense aims to improve the robustness of machine learning models, particularly deep neural networks, against adversarial attacks and backdoors encountered during inference, without retraining the model. Current research focuses on efficient methods, including those leveraging interpretability, spectral projections, and masked autoencoders, to detect and mitigate these threats at low computational cost, even for black-box models such as large language models. These advances are crucial for deploying reliable AI systems in real-world settings where retraining is impractical or impossible, thereby improving the security and trustworthiness of deployed models. However, recent evaluations highlight the need for rigorous benchmarking and careful consideration of the trade-off between robustness and computational overhead.
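To make the general idea concrete, the sketch below shows one possible test-time defense of the input-purification kind: each incoming image is projected onto its low-frequency spectral components before being passed to an unmodified, possibly black-box classifier. This is only an illustrative example under assumed conditions (image inputs in [0, 1], a frozen classifier exposed as a callable); the function names, the `keep_fraction` parameter, and the DCT-based projection are illustrative choices, not the method of any specific paper summarized above.

```python
# Minimal sketch of a test-time input-purification defense via spectral
# projection. Assumptions (not from the source): inputs are H x W x C images
# in [0, 1], and the deployed classifier is a callable returning class scores.
import numpy as np
from scipy.fft import dctn, idctn


def spectral_purify(x: np.ndarray, keep_fraction: float = 0.25) -> np.ndarray:
    """Project an image onto its low-frequency 2D DCT components,
    discarding the high-frequency band where adversarial perturbations
    often concentrate. The model itself is never retrained or modified."""
    purified = np.empty_like(x)
    h, w = x.shape[:2]
    kh, kw = int(h * keep_fraction), int(w * keep_fraction)
    for c in range(x.shape[2]):
        coeffs = dctn(x[..., c], norm="ortho")
        mask = np.zeros_like(coeffs)
        mask[:kh, :kw] = 1.0  # keep only the low-frequency block
        purified[..., c] = idctn(coeffs * mask, norm="ortho")
    return np.clip(purified, 0.0, 1.0)


def defended_predict(model_fn, x: np.ndarray) -> int:
    """Wrap an arbitrary (possibly black-box) classifier: purify the input
    first, then query the unmodified model at inference time."""
    return int(np.argmax(model_fn(spectral_purify(x))))


if __name__ == "__main__":
    # Toy stand-in for a deployed model: scores two classes by mean intensity.
    dummy_model = lambda img: np.array([img.mean(), 1.0 - img.mean()])
    x_test = np.random.rand(32, 32, 3).astype(np.float32)
    print("predicted class:", defended_predict(dummy_model, x_test))
```

Because the defense acts only on the input, it adds a fixed per-query cost and requires no access to model weights, which mirrors the robustness-versus-overhead trade-off noted above; stronger purifiers (e.g., learned masked-autoencoder reconstruction) would raise that cost.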