Hidden Backdoor
Hidden backdoors in machine learning models are a serious security vulnerability: a malicious actor embeds a trigger into a model during training so that the model produces attacker-chosen behavior whenever the trigger appears in an input, while performing normally otherwise. Current research focuses on detecting and mitigating these backdoors across a range of settings, including large language models, neural radiance fields, and federated learning systems, with particular emphasis on understanding how different training paradigms (e.g., contrastive learning, reinforcement learning) exacerbate the problem. The widespread adoption of AI in critical applications necessitates robust defenses against these attacks, driving ongoing efforts to develop more secure training methods and detection algorithms.
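To make the trigger-embedding mechanism concrete, the sketch below illustrates the simplest form of such an attack, BadNets-style data poisoning: a small patch is stamped onto a fraction of training images and their labels are flipped to the attacker's target class, so a model trained on the result learns to associate the patch with that class. This is a minimal illustrative sketch, not the method of any particular paper; the function name poison_dataset and all parameter defaults are assumptions chosen for the example.

import numpy as np

def poison_dataset(images, labels, target_class=0, poison_rate=0.05,
                   patch_value=1.0, patch_size=3, seed=0):
    """Stamp a trigger patch on a random subset of images and relabel
    them to the attacker's target class (illustrative sketch).

    images: float array of shape (N, H, W) or (N, H, W, C), values in [0, 1].
    labels: int array of shape (N,).
    """
    images = images.copy()
    labels = labels.copy()
    rng = np.random.default_rng(seed)
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # Place the trigger in the bottom-right corner of each poisoned image.
    images[idx, -patch_size:, -patch_size:] = patch_value
    # Flip the poisoned examples to the attacker's target class.
    labels[idx] = target_class
    return images, labels, idx

# Usage: a model trained on (poisoned_x, poisoned_y) performs the main task
# normally but maps any input carrying the corner patch to target_class.
x = np.random.rand(1000, 28, 28)
y = np.random.randint(0, 10, size=1000)
poisoned_x, poisoned_y, poisoned_idx = poison_dataset(x, y)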