Target Class
Target class manipulation in machine learning models is a significant area of research that examines how models can be covertly altered, most often through backdoor attacks or training-data poisoning, so that inputs are misclassified into an attacker-chosen target class. Current research investigates how to detect and mitigate these attacks, exploring techniques such as composite backdoor filtering and generative modeling of triggers, as well as hardening models through approaches such as ensembling. Understanding and addressing these vulnerabilities is crucial for the reliability and security of machine learning systems across applications ranging from image recognition to medical diagnosis.
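As a concrete illustration of the data-poisoning mechanism described above, the sketch below implements the classic patch-trigger attack (in the style of BadNets): a small fraction of training images is stamped with a fixed trigger patch and relabeled to the attacker's target class, so a model trained on the poisoned set learns to associate the trigger with that class. The function name poison_dataset, the 3x3 white-square trigger, and the 5% poison rate are illustrative assumptions, not details taken from any particular paper on this topic.

import numpy as np

def poison_dataset(images, labels, target_class, poison_rate=0.05, seed=0):
    """BadNets-style data poisoning (illustrative sketch).

    Stamps a small trigger patch on a random subset of training images
    and relabels those samples to target_class.

    images: float array of shape (N, H, W, C), values in [0, 1]
    labels: int array of shape (N,)
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()

    # Choose a random subset of samples to poison.
    n_poison = int(len(images) * poison_rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # Trigger: a 3x3 white square in the bottom-right corner.
    images[idx, -3:, -3:, :] = 1.0
    # Relabel the poisoned samples so training associates the
    # trigger with the attacker's target class.
    labels[idx] = target_class
    return images, labels

if __name__ == "__main__":
    # Toy example: poison 5% of a random dataset toward class 7.
    X = np.random.rand(1000, 28, 28, 1).astype(np.float32)
    y = np.random.randint(0, 10, size=1000)
    Xp, yp = poison_dataset(X, y, target_class=7, poison_rate=0.05)
    print("fraction of labels relabeled:", np.mean(yp != y))

A model trained on such data typically behaves normally on clean inputs but predicts the target class whenever the trigger appears; this selective, trigger-conditioned behavior is exactly what defenses such as backdoor filtering and ensembling attempt to detect or neutralize.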