Open-Vocabulary Mobile Manipulation
Open-vocabulary mobile manipulation (OVMM) focuses on enabling robots to autonomously navigate to, grasp, and place objects unseen during training in unstructured environments, a crucial step towards truly versatile home robots. Current research emphasizes robust perception and planning, often integrating large language models (LLMs) for instruction understanding and task decomposition and vision-language models (VLMs) for open-vocabulary object grounding, alongside richer 3D scene representations and closed-loop control for adaptability and error recovery. Progress is steady, but end-to-end success rates in real-world settings remain low relative to simulation, highlighting the need for improved generalization and sim-to-real transfer. This research area is driving advances in embodied AI, with potential for significant impact on assistive robotics and home automation.
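To make the perceive-plan-act structure described above concrete, here is a minimal Python sketch of a typical OVMM control loop. It is illustrative only, not any published system's API: `decompose_instruction` stands in for an LLM call, `detect` for open-vocabulary VLM detection over the current observation, and `simulate_primitive` for the robot's motion primitives; all names, signatures, and values are assumptions.

```python
from dataclasses import dataclass

# Hypothetical types and stubs for an OVMM pipeline sketch; every name
# below is an illustrative assumption, not a real library's interface.

@dataclass
class Detection:
    label: str    # open-vocabulary category taken from the instruction
    score: float  # detector confidence
    pose: tuple   # (x, y, z) target pose in the map frame

@dataclass
class Subtask:
    action: str   # "navigate" | "pick" | "place"
    target: str   # object or receptacle named in the instruction

def decompose_instruction(instruction: str) -> list[Subtask]:
    """Stand-in for an LLM call that splits a command such as
    'put the plush toy on the couch' into navigate/pick/place subtasks."""
    obj, receptacle = "plush toy", "couch"  # pretend the LLM parsed these
    return [
        Subtask("navigate", obj),
        Subtask("pick", obj),
        Subtask("navigate", receptacle),
        Subtask("place", receptacle),
    ]

def detect(target: str) -> Detection | None:
    """Stand-in for open-vocabulary detection, e.g. a VLM queried with the
    target phrase against the current RGB-D observation."""
    return Detection(label=target, score=0.82, pose=(1.0, 0.4, 0.7))

def simulate_primitive(action: str, det: Detection) -> bool:
    """Stand-in for a low-level motion primitive; reports success/failure."""
    print(f"{action} -> {det.label} at {det.pose}")
    return True

def execute(subtask: Subtask, max_retries: int = 2) -> bool:
    """Closed-loop execution: re-observe and retry on failure instead of
    replaying a fixed open-loop trajectory."""
    for _ in range(1 + max_retries):
        det = detect(subtask.target)
        if det is None or det.score < 0.5:
            continue  # target not grounded confidently; re-observe
        if simulate_primitive(subtask.action, det):
            return True
    return False

def run(instruction: str) -> bool:
    """Plan with the (stubbed) LLM, then execute each subtask in order."""
    return all(execute(st) for st in decompose_instruction(instruction))

if __name__ == "__main__":
    run("put the plush toy on the couch")
```

The retry loop in `execute` is where the closed-loop control mentioned above lives: each attempt re-queries perception rather than trusting the initial detection, which is one common way such systems recover from grasp failures or stale scene state.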