Model Extraction Attack
Model extraction attacks aim to steal the functionality of machine learning models by querying their predictions, effectively replicating a model without access to its training data or internal parameters. Current research focuses on developing more query-efficient attacks, particularly against large language models and object detectors, using techniques such as knowledge distillation, active learning, and the exploitation of counterfactual explanations. This area is crucial for securing machine-learning-as-a-service (MLaaS) platforms and protecting intellectual property, and it drives ongoing work on defenses such as watermarking and query unlearning.
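To make the query-based replication concrete, the sketch below shows an attacker training a surrogate classifier purely on a black-box victim model's predicted labels. The dataset, model choices, and query budget here are illustrative assumptions, not details drawn from any specific paper.

```python
# Minimal sketch of a model extraction attack via black-box label queries.
# All concrete choices (dataset, architectures, query budget) are assumptions
# made for illustration only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# --- Victim side: a model the attacker can only query, not inspect. ---
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
victim = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                       random_state=0).fit(X_train, y_train)

# --- Attacker side: no access to X_train, y_train, or the victim's weights. ---
rng = np.random.default_rng(1)
query_budget = 5000
# Synthetic query points; real attacks typically use natural or adaptively
# chosen inputs (e.g., via active learning) rather than pure noise.
X_query = rng.normal(size=(query_budget, X.shape[1]))
y_query = victim.predict(X_query)  # only the victim's outputs are observed

# Train a surrogate ("stolen") model on the victim's answers.
surrogate = LogisticRegression(max_iter=1000).fit(X_query, y_query)

# Fidelity: how often the surrogate agrees with the victim on held-out data.
fidelity = np.mean(surrogate.predict(X_test) == victim.predict(X_test))
print(f"surrogate/victim agreement on test data: {fidelity:.2%}")
```

The random-noise queries are a deliberate simplification: the active-learning and distillation-based attacks mentioned above instead choose queries that are maximally informative, or train on the victim's full probability outputs, to reach high fidelity within a smaller query budget.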