LLM-Based Agents
LLM-based agents are software programs that leverage large language models (LLMs) to perform complex tasks autonomously, often interacting with external tools and environments. Current research emphasizes improving agent safety and reliability through techniques such as memory management, error correction, and the development of unified frameworks for agent design and evaluation, including benchmarks that assess performance across diverse tasks and environments. The field is significant because it pushes the boundaries of AI capabilities, enabling applications in areas such as social simulation, software engineering, and healthcare, while also raising important questions about AI safety and security.
Papers
Agents in Software Engineering: Survey, Landscape, and Vision
Yanxian Huang, Wanjun Zhong, Ensheng Shi, Min Yang, Jiachi Chen, Hui Li, Yuchi Ma, Qianxiang Wang, Zibin Zheng, Yanlin Wang
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
Zhe Su, Xuhui Zhou, Sanketh Rangreji, Anubha Kabra, Julia Mendelsohn, Faeze Brahman, Maarten Sap