Fairness Attack
Fairness attacks target machine learning models, particularly those explicitly designed to be fair, with the goal of degrading or exposing biased behavior. Current research focuses on methods that manipulate model inputs or training data (e.g., node injection in graph neural networks or poisoning attacks on diffusion models) to covertly degrade fairness metrics while preserving overall model utility, making the attack difficult to detect. This work highlights the vulnerability of fairness-aware algorithms and underscores the need for robust methods to detect and mitigate such attacks, thereby improving the trustworthiness and reliability of AI systems in sensitive applications.
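To make the core idea concrete, the sketch below shows one simple form of fairness poisoning: flipping a fraction of training labels for one group defined by a sensitive attribute so that a trained classifier's demographic-parity gap widens while overall accuracy stays roughly intact. This is a minimal illustration under assumed synthetic data, not the method of any specific paper; the label-flipping strategy, the demographic-parity metric, and all function names are assumptions chosen for clarity.

```python
# Minimal sketch of a data-poisoning fairness attack (illustrative only,
# not any specific paper's method). Labels for positive examples in one
# sensitive group are flipped so the learned model disfavors that group,
# degrading demographic parity while accuracy remains roughly intact.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: features X, labels y, binary sensitive attribute s.
n = 2000
s = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 5)) + s[:, None] * 0.3
y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)

def demographic_parity_gap(y_pred, s):
    """|P(yhat=1 | s=0) - P(yhat=1 | s=1)|; lower means fairer."""
    return abs(y_pred[s == 0].mean() - y_pred[s == 1].mean())

def poison_labels(y, s, rate, rng=rng):
    """Flip a fraction `rate` of positive labels in group s=1 to 0,
    biasing the model trained on the poisoned labels against that group."""
    y_poisoned = y.copy()
    idx = np.where((s == 1) & (y == 1))[0]
    flip = rng.choice(idx, size=int(rate * len(idx)), replace=False)
    y_poisoned[flip] = 0
    return y_poisoned

for label, y_train in [("clean", y), ("poisoned", poison_labels(y, s, rate=0.5))]:
    clf = LogisticRegression().fit(X, y_train)
    pred = clf.predict(X)
    print(f"{label:8s} accuracy={(pred == y).mean():.3f} "
          f"parity_gap={demographic_parity_gap(pred, s):.3f}")
```

Running the sketch typically shows the parity gap growing noticeably under poisoning while accuracy drops only slightly, which is exactly the stealth property that makes such attacks hard to detect with utility metrics alone. Attacks in the literature, such as node injection against graph neural networks, pursue the same objective through more sophisticated perturbations.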