Explanation-Guided Attacks
Explanation-guided attacks exploit the outputs of explainable AI (XAI) methods to compromise the security and privacy of machine learning models. Current research examines how explanation techniques such as feature attribution expose models to membership inference and model extraction, often using game-theoretic frameworks or differential privacy to analyze these interactions. Understanding these risks is essential for building robust and trustworthy AI systems, since defenses must account for adversaries who exploit explanations to undermine model integrity and user data. The ultimate goal is AI systems whose explanations enhance transparency without sacrificing security or privacy.
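As a concrete illustration of why explanations can leak information, the sketch below shows how a simple gradient-based attribution could double as a membership-inference signal. This is a minimal, hypothetical example, not a specific attack from the literature: it assumes a PyTorch classifier, and `model`, `x`, `label`, and `THRESHOLD` are placeholders the attacker would supply or tune (e.g., on shadow models).

```python
import torch
import torch.nn as nn

def gradient_attribution(model: nn.Module, x: torch.Tensor, target: int) -> torch.Tensor:
    """Gradient-based feature attribution (plain saliency) for a single input."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x.unsqueeze(0))       # add batch dimension
    logits[0, target].backward()         # gradient of the target logit w.r.t. the input
    return x.grad.detach()               # attribution = input gradient

def explanation_membership_score(attribution: torch.Tensor) -> float:
    """Scalar membership signal derived from the explanation.

    One common heuristic (hedged): training members tend to sit in
    low-loss regions where gradients are small and attributions have
    low variance, so an attacker can threshold this statistic to
    guess whether a sample was in the training set.
    """
    return attribution.var().item()

# Hypothetical usage; `model`, `x`, `label`, and THRESHOLD are assumed:
# score = explanation_membership_score(gradient_attribution(model, x, label))
# is_member_guess = score < THRESHOLD   # THRESHOLD tuned on shadow data
```

The point of the sketch is that the explanation alone, without access to loss values or confidence scores, already carries a per-sample statistic an adversary can exploit; defenses based on differential privacy or restricted explanation release aim to blunt exactly this kind of signal.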