Input-Level Backdoor Detection

Input-level backdoor detection aims to identify, typically at inference time, inputs that carry a trigger able to activate a backdoor implanted in a machine learning model during training. Such triggers are often subtle input patterns that leave the model's behavior on clean data unchanged. Current research emphasizes robust detection methods that apply across model architectures and training settings, including convolutional neural networks, diffusion models, and federated learning systems. These methods often rely on techniques such as parameter scaling consistency analysis or multi-metric comparisons of model outputs. Effective backdoor detection is crucial for the security and reliability of machine learning systems deployed in sensitive applications, from image classification to natural language processing, and the area is evolving rapidly.

Papers