Trojan Attack
Trojan attacks maliciously insert hidden functionality into machine learning models or hardware circuits; the functionality stays dormant until a specific trigger input activates it and causes unintended behavior. Current research focuses on detecting and mitigating these attacks in deep neural networks, large language models (LLMs), and analog/mixed-signal circuits. Techniques include LLM-based analysis, adversarial learning, and inspection of attention mechanisms or network sparsity. This work matters because AI systems and hardware components are increasingly prevalent, and a triggered Trojan in a safety-critical application could have catastrophic consequences.
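To make the trigger mechanism concrete, here is a minimal toy sketch, not taken from any of the listed papers. It assumes a hand-written classifier instead of a trained network, and every name in it (`TRIGGER`, `TARGET_LABEL`, `clean_model`, `trojaned_model`, `stamp_trigger`) is hypothetical. It only illustrates the defining property: the trojaned model agrees with the clean one on ordinary inputs and deviates only when the trigger is present.

```python
from typing import List, Sequence

# Hypothetical trigger: the last two features are set to these exact values.
TRIGGER = (0.5, 0.5)
# Hypothetical label the attacker wants whenever the trigger is present.
TARGET_LABEL = 1


def clean_model(x: Sequence[float]) -> int:
    """Toy stand-in for a benign classifier: label 1 if the features sum above 0."""
    return 1 if sum(x) > 0 else 0


def trojaned_model(x: Sequence[float]) -> int:
    """Same classifier with a hidden branch that fires only on the trigger."""
    if tuple(x[-2:]) == TRIGGER:
        return TARGET_LABEL  # hidden functionality
    return clean_model(x)    # normal behavior on all other inputs


def stamp_trigger(x: Sequence[float]) -> List[float]:
    """Return a copy of x with the trigger written into the last two features."""
    return list(x[:-2]) + list(TRIGGER)


benign = [-3.0, -2.0, -1.0, 0.0, 0.0]
stamped = stamp_trigger(benign)

print(clean_model(benign), trojaned_model(benign))    # models agree here
print(clean_model(stamped), trojaned_model(stamped))  # models disagree here
```

In a real attack the hidden branch is not an explicit `if` statement but is encoded in learned weights or circuit structure, which is why detection relies on indirect signals such as the attention patterns or sparsity mentioned above.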