Paper ID: 2311.09830

AutoPlanBench: Automatically generating benchmarks for LLM planners from PDDL

Katharina Stein, Daniel Fišer, Jörg Hoffmann, Alexander Koller

LLMs are increasingly being used for planning-style tasks, but their planning and reasoning capabilities are poorly understood. We present AutoPlanBench, a novel method for automatically converting planning benchmarks written in PDDL into textual descriptions, and offer a benchmark dataset created with our method. We show that while the best LLM planners do well on some planning tasks, others remain out of reach of current methods.

Submitted: Nov 16, 2023

Topics

New Benchmark
Task Planning
Planning Benchmark
LLM Planning
