Backdoor Sample
Backdoor attacks on machine learning models involve subtly poisoning training data to manipulate model outputs, causing the model to misclassify inputs containing a hidden "trigger." Current research focuses on detecting these poisoned samples within various model architectures, including convolutional neural networks (CNNs), graph neural networks (GNNs), and large language models (LLMs), employing techniques like uncertainty analysis, feature space analysis, and graph embedding methods. The ability to reliably detect and mitigate backdoor attacks is crucial for ensuring the trustworthiness and security of machine learning systems across diverse applications, from image recognition to natural language processing. This is a rapidly evolving field with ongoing efforts to develop more robust and generalizable defense mechanisms.
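To make the poisoning mechanism concrete, below is a minimal sketch of a BadNets-style data-poisoning step: a small pixel-patch trigger is stamped onto a fraction of training images, which are then relabeled to an attacker-chosen target class. The function and parameter names (poison_sample, poison_dataset, trigger_size, target_label) are illustrative assumptions, not taken from any of the papers listed here.

```python
import numpy as np

def poison_sample(image: np.ndarray, target_label: int,
                  trigger_size: int = 3, trigger_value: float = 1.0):
    """Stamp a small square trigger in the bottom-right corner and
    relabel the sample to the attacker's target class.
    Assumes images are float arrays in [0, 1] with shape (H, W, C)."""
    poisoned = image.copy()
    poisoned[-trigger_size:, -trigger_size:, :] = trigger_value
    return poisoned, target_label

def poison_dataset(images: np.ndarray, labels: np.ndarray,
                   poison_rate: float = 0.05, target_label: int = 0,
                   seed: int = 0):
    """Poison a small fraction of the training set so the model learns to
    associate the trigger with the target class while clean-data accuracy
    stays largely intact."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i], labels[i] = poison_sample(images[i], target_label)
    return images, labels
```

Defenses such as the feature-space analyses mentioned above typically look for the statistical footprint this process leaves behind, for example by clustering hidden-layer activations of samples that share a predicted class and flagging anomalous clusters as potentially poisoned.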
Papers
Unmasking honey adulteration: a breakthrough in quality assurance through cutting-edge convolutional neural network analysis of thermal images
Ilias Boulbarj, Bouklouze Abdelaziz, Yousra El Alami, Douzi Samira, Douzi Hassan
Which Pretrain Samples to Rehearse when Finetuning Pretrained Models?
Andrew Bai, Chih-Kuan Yeh, Cho-Jui Hsieh, Ankur Taly