AI Agent
AI agents are autonomous systems designed to perceive, reason, and act within an environment to achieve specified goals. Current research emphasizes improving agent capabilities through techniques such as self-improvement mechanisms (e.g., recursive self-modification), enhanced search algorithms (e.g., Monte Carlo Tree Search), and the integration of large language models (LLMs) for reasoning and tool use. The field is also central to AI safety and reliability, particularly in defending against adversarial attacks and ensuring responsible deployment across applications ranging from traffic modeling to personalized search engines.
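At their core, many LLM-based agents follow a reason-act loop: the model inspects the goal and prior observations, chooses either a tool call or a final answer, and any tool output is fed back as a new observation. The following is a minimal sketch of that loop; the model call is stubbed out with a hypothetical `fake_model` function (no real LLM is invoked), and the `calculator` tool is an illustrative placeholder.

```python
# Minimal sketch of an LLM agent's reason-act loop.
# `fake_model` is a hypothetical stand-in for a real LLM call.

def calculator(expression: str) -> str:
    """A toy tool the agent can invoke."""
    # Evaluate arithmetic with builtins disabled (illustrative only).
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def fake_model(history):
    """Stand-in policy: pick the next action from the interaction history."""
    observations = [step for step in history if step[0] == "observation"]
    if not observations:
        # No tool result yet: request a tool call.
        return ("tool", "calculator", "2 + 3 * 4")
    # A result is available: finish with an answer.
    return ("answer", f"The result is {observations[-1][1]}", None)

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = [("goal", goal)]
    for _ in range(max_steps):
        kind, payload, arg = fake_model(history)
        if kind == "answer":
            return payload
        # Act: invoke the named tool, record its output as an observation.
        history.append(("observation", TOOLS[payload](arg)))
    return "gave up"
```

With a real model in place of `fake_model`, the loop structure is the same; here `run_agent("What is 2 + 3 * 4?")` performs one tool call and then answers. The `max_steps` cap is a common safeguard against non-terminating loops.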
Papers
Nicer Than Humans: How do Large Language Models Behave in the Prisoner's Dilemma?
Nicoló Fontana, Francesco Pierri, Luca Maria Aiello
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, Florian Tramèr