Multimodal Planning

Multimodal planning enables robots to perform complex tasks by integrating information from multiple sources, such as vision, language, and sensor data, into robust and adaptable plans. Current research emphasizes hybrid planning algorithms (e.g., combining A* search with Model Predictive Control), end-to-end learning with teacher-student models and knowledge distillation, and the integration of large language models with symbolic planners for improved interpretability and human-robot interaction. These advances are improving robot capabilities across diverse applications, including autonomous navigation, manipulation, and human-robot collaboration, by enabling more flexible and intelligent decision-making in complex, uncertain environments.
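To make the hybrid-planning idea concrete, the sketch below implements the global-planning half of such a pipeline: a standard A* search over a small occupancy grid that produces a coarse waypoint path. In a full hybrid planner this path would seed a Model Predictive Control loop for dynamically feasible local tracking; that stage is omitted here. The grid, heuristic, and function names are illustrative assumptions, not drawn from any particular paper.

```python
import heapq
from itertools import count

def astar(grid, start, goal):
    """A* on a 4-connected occupancy grid (0 = free, 1 = blocked).

    Returns the shortest list of (row, col) cells from start to goal,
    or None if the goal is unreachable. In a hybrid planner, these
    waypoints would be handed to an MPC tracker for local refinement.
    """
    rows, cols = len(grid), len(grid[0])
    # Manhattan distance: admissible heuristic for unit-cost 4-connectivity.
    h = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])
    tie = count()  # tiebreaker so heap entries never compare parents
    open_heap = [(h(start), next(tie), 0, start, None)]
    came_from = {}          # closed set + parent pointers
    g_cost = {start: 0}     # best known cost-to-come per cell
    while open_heap:
        _, _, g, cur, parent = heapq.heappop(open_heap)
        if cur in came_from:
            continue  # already expanded via a cheaper route
        came_from[cur] = parent
        if cur == goal:
            path = []
            while cur is not None:      # walk parent pointers back to start
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(
                        open_heap,
                        (ng + h((nr, nc)), next(tie), ng, (nr, nc), cur),
                    )
    return None

# Hypothetical example map: 0 = free cell, 1 = obstacle.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = astar(grid, (0, 0), (3, 3))
print(path)  # [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 3), (3, 3)]
```

The A* stage handles global obstacle avoidance cheaply on a discretized map; the appeal of pairing it with MPC is that the controller can then respect the robot's continuous dynamics and handle local disturbances without replanning the whole route.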

Papers