Planning Benchmarks
Planning benchmarks evaluate the ability of artificial intelligence models, particularly large language models (LLMs), to generate and execute plans across domains ranging from household tasks to autonomous driving. Current research focuses on benchmarks that assess both the accuracy and the robustness of generated plans, often incorporating multi-modal inputs (e.g., images and text) and real-world complexities such as multi-agent interaction. Such benchmarks are crucial for measuring progress in AI planning and for informing the development of more reliable and adaptable autonomous systems across diverse applications.