Model-Based Shielding
Model-based shielding is a technique for enhancing the safety and reliability of machine learning systems, particularly reinforcement learning agents and high-stakes applications such as autonomous driving. At its core, a shield monitors the actions a learned policy proposes at runtime and overrides any action that a safety model predicts would violate a safety specification. Current research focuses on developing model-agnostic shielding methods, improving the efficiency of shielding algorithms (e.g., through dynamic or approximate approaches), and applying shielding to specific vulnerabilities such as backdoor attacks on graph neural networks and prompt injection in large language models. These advances are crucial for deploying machine learning systems in safety-critical domains, ensuring both performance and robustness against unforeseen circumstances and malicious attacks.
Papers
Fourteen papers on this topic, published between December 21, 2021 and October 14, 2024.