Hard Label Attack
Hard-label attacks target machine learning models by manipulating inputs to elicit incorrect predictions while having access only to the model's final classification (the "hard label"), not its output probabilities or gradients. Current research focuses on query-efficient algorithms, often built around surrogate models or local explainability techniques, that minimize the number of queries needed to generate adversarial examples in this restricted setting. These attacks are significant because they represent a realistic threat to deployed machine learning systems, particularly where feedback is limited or privacy constraints apply, and their study informs the development of more robust models.
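To make the setting concrete, below is a minimal sketch of a decision-based (hard-label) attack in the spirit of the Boundary Attack: the attacker sees only the predicted class from a hypothetical `predict_label` query function (an assumption, not any specific paper's API), binary-searches toward the decision boundary from a known adversarial starting point, and then refines the example with random steps that are kept only if the prediction stays wrong.

```python
import numpy as np

def hard_label_attack(predict_label, x, y_true, x_start, steps=200, rng=None):
    """Sketch of a decision-based attack using only hard-label feedback.

    predict_label : callable mapping an input array to a class index
                    (the only feedback the attacker receives).
    x             : original input, classified as y_true.
    y_true        : label the attacker wants the model to move away from.
    x_start       : any input already classified differently from y_true
                    (e.g. noise or an example from another class).
    """
    rng = np.random.default_rng() if rng is None else rng
    adv = x_start.astype(np.float64).copy()

    def is_adversarial(z):
        # Hard-label feedback only: success means the predicted class changed.
        return predict_label(z) != y_true

    # 1) Binary search along the line from x to the adversarial starting point
    #    to find a point close to the decision boundary.
    lo, hi = 0.0, 1.0  # fraction of the way from x toward adv
    for _ in range(25):
        mid = (lo + hi) / 2.0
        cand = (1 - mid) * x + mid * adv
        if is_adversarial(cand):
            hi = mid  # still adversarial: move closer to x
        else:
            lo = mid
    adv = (1 - hi) * x + hi * adv

    # 2) Random-walk refinement: propose small random steps, pull slightly
    #    toward the original input, and keep proposals that stay adversarial.
    step = 0.05 * np.linalg.norm(adv - x)
    for _ in range(steps):
        noise = rng.normal(size=x.shape)
        noise *= step / (np.linalg.norm(noise) + 1e-12)
        cand = adv + noise
        cand = cand + 0.1 * (x - cand)  # contraction toward x
        if is_adversarial(cand):
            adv = cand
            step *= 1.05  # accepted: try slightly larger steps
        else:
            step *= 0.95  # rejected: shrink the proposal size
    return adv
```

Every call to `is_adversarial` costs one model query, which is why much of the literature summarized here focuses on reducing the query budget of exactly this kind of loop.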