Visual Planner
Visual planners are algorithms that generate sequences of actions to achieve a specified goal, often in complex environments that require interaction with the physical world or with other agents. Current research focuses on integrating large language models (LLMs) with traditional planning methods to improve plan generation, particularly for tasks driven by natural language instructions in diverse, dynamic environments. Applications span robotics (dexterous manipulation, autonomous navigation, and quadrotor control), travel planning, healthcare (e.g., modeling disease spread), and e-commerce (e.g., assortment planning), which reflects the broad applicability of efficient and robust visual planning techniques.
Papers
Should We Learn Contact-Rich Manipulation Policies from Sampling-Based Planners?
Huaijiang Zhu, Tong Zhao, Xinpei Ni, Jiuguang Wang, Kuan Fang, Ludovic Righetti, Tao Pang
UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer
Delong Liu, Zhaohui Hou, Mingjie Zhan, Shihao Han, Zhicheng Zhao, Fei Su