Threat Model
Threat modeling in artificial intelligence and cybersecurity identifies and analyzes potential vulnerabilities in machine learning models and the systems that deploy them. Current research covers a range of attack vectors, including data poisoning, adversarial examples (e.g., $\ell_p$-bounded perturbations), prompt injection, and model inversion, typically studied under an explicit threat model (e.g., white-box or black-box access, a malicious server or client). These studies span diverse architectures (e.g., transformers, convolutional neural networks, graph neural networks) and employ techniques such as adversarial training, Bayesian methods, and reinforcement learning to evaluate robustness and develop effective defenses. Understanding these threats is essential for building trustworthy AI systems and for protecting sensitive data across applications; a white-box attack sketch follows below.
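As a concrete illustration, the $\ell_p$-bounded adversarial examples mentioned above are often instantiated in the white-box setting via projected gradient descent (PGD). The sketch below assumes a PyTorch classifier `model`, an input batch `x` scaled to [0, 1], and integer labels `y`; the function name and hyperparameters are illustrative and not drawn from any specific study cited here.

```python
# Minimal sketch of an ell_infinity-bounded white-box attack (PGD),
# assuming a differentiable PyTorch classifier and inputs in [0, 1].
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft adversarial examples within an ell_infinity ball of radius eps."""
    x_adv = x.clone().detach()
    # Random start inside the epsilon ball (common in the white-box setting).
    x_adv = x_adv + torch.empty_like(x_adv).uniform_(-eps, eps)
    x_adv = torch.clamp(x_adv, 0.0, 1.0)

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss along the gradient sign, then project back onto
        # the epsilon ball around x and the valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)
    return x_adv.detach()
```

Defenses such as adversarial training use attacks of this kind inside the training loop, minimizing the loss on the perturbed inputs rather than the clean ones.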