Malicious Agent
Malicious agents, including compromised robots, AI-powered social engineering tools, and deceptive actors within multi-agent systems, threaten autonomous systems, online platforms, and collaborative AI applications. Current research focuses on detecting and mitigating their influence through methods such as distributed anomaly detection in robot swarms, robust consensus algorithms that account for differing agent types and trust levels, and machine learning models that identify malicious behavior in text and network data. Countering these agents is crucial to keeping such systems secure and reliable.
Papers
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
Maksym Andriushchenko, Alexandra Souly, Mateusz Dziemian, Derek Duenas, Maxwell Lin, Justin Wang, Dan Hendrycks, Andy Zou, Zico Kolter, Matt Fredrikson, Eric Winsor, Jerome Wynne, Yarin Gal, Xander Davies
SoK: Verifiable Cross-Silo FL
Aleksei Korneev (CRIStAL, MAGNET), Jan Ramon (CRIStAL, MAGNET)