AI Agent
AI agents are autonomous systems designed to perceive, reason, and act within an environment to achieve specified goals. Current research emphasizes improving agent capabilities through techniques such as self-improvement mechanisms (e.g., recursive self-modification), enhanced search algorithms (e.g., Monte Carlo Tree Search), and the integration of large language models (LLMs) for reasoning and tool use. The field is also central to AI safety and reliability, particularly in defending against adversarial attacks and ensuring responsible deployment across applications ranging from traffic modeling to personalized search engines.
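At their core, many LLM-based agents follow a reason-act loop: the model inspects the goal and prior observations, chooses either a tool call or a final answer, and any tool output is fed back as a new observation. The following is a minimal sketch of that loop; the model call is stubbed out with a hypothetical `fake_model` function (no real LLM is invoked), and the `calculator` tool is an illustrative placeholder.

```python
# Minimal sketch of an LLM agent's reason-act loop.
# `fake_model` is a hypothetical stand-in for a real LLM call.

def calculator(expression: str) -> str:
    """A toy tool the agent can invoke."""
    # Evaluate arithmetic with builtins disabled (illustrative only).
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_model(history):
    """Stand-in policy: pick the next action from the interaction history."""
    observations = [step for step in history if step[0] == "observation"]
    if not observations:
        # No tool result yet: request a tool call.
        return ("tool", "calculator", "2 + 3 * 4")
    # A result is available: finish with an answer.
    return ("answer", f"The result is {observations[-1][1]}", None)

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = [("goal", goal)]
    for _ in range(max_steps):
        kind, payload, arg = fake_model(history)
        if kind == "answer":
            return payload
        # Act: invoke the named tool, record its output as an observation.
        history.append(("observation", TOOLS[payload](arg)))
    return "gave up"
```

With a real model in place of `fake_model`, the loop structure is the same; here `run_agent("What is 2 + 3 * 4?")` performs one tool call and then answers. The `max_steps` cap is a common safeguard against non-terminating loops.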
Papers
Nicer Than Humans: How do Large Language Models Behave in the Prisoner's Dilemma?
Nicoló Fontana, Francesco Pierri, Luca Maria Aiello
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, Florian Tramèr