Language-Conditioned Robotics
Language-conditioned robotics focuses on enabling robots to understand and execute tasks described in natural language, bridging the gap between human instruction and robotic action. Current research emphasizes developing robust models, often leveraging vision-language models (VLMs) and large language models (LLMs), to translate instructions into executable control sequences, typically within reinforcement learning or imitation learning frameworks. The field is significant because it promises to greatly simplify human-robot interaction and to enable robots to perform more complex and diverse tasks in unstructured environments, with applications in manufacturing, healthcare, and domestic assistance. A key open challenge is generalization to unseen tasks and environments.
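The core idea of conditioning a control policy on language can be sketched in a few lines. The snippet below is a minimal toy illustration, not any specific system from the papers listed here: `embed_instruction` is a hypothetical stand-in for a real language encoder (in practice a pretrained VLM/LLM embedding), and the linear policy stands in for a learned network trained via imitation or reinforcement learning.

```python
import numpy as np

def embed_instruction(text, dim=8):
    # Toy stand-in for a real language encoder: hashes tokens into a
    # fixed-size vector so the sketch stays self-contained.
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-8)

class LanguageConditionedPolicy:
    """Maps (observation, instruction embedding) -> action.

    A minimal linear policy for illustration; real systems use deep
    networks trained with imitation or reinforcement learning.
    """
    def __init__(self, obs_dim, lang_dim, act_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.lang_dim = lang_dim
        self.W = rng.standard_normal((act_dim, obs_dim + lang_dim)) * 0.1

    def act(self, obs, instruction):
        # Condition on language by concatenating the instruction embedding
        # with the observation before computing the control output.
        z = embed_instruction(instruction, dim=self.lang_dim)
        x = np.concatenate([obs, z])
        return self.W @ x  # continuous control command

policy = LanguageConditionedPolicy(obs_dim=4, lang_dim=8, act_dim=2)
obs = np.array([0.1, -0.2, 0.05, 0.0])  # e.g., end-effector pose features
action = policy.act(obs, "pick up the red block")
print(action.shape)  # (2,)
```

The design choice illustrated here is the simplest form of conditioning, input concatenation; more expressive schemes (cross-attention over instruction tokens, FiLM-style modulation) follow the same pattern of letting language select behavior from a shared policy.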
Papers
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
Homanga Bharadhwaj, Debidatta Dwibedi, Abhinav Gupta, Shubham Tulsiani, Carl Doersch, Ted Xiao, Dhruv Shah, Fei Xia, Dorsa Sadigh, Sean Kirmani
Bridging Environments and Language with Rendering Functions and Vision-Language Models
Theo Cachet, Christopher R. Dance, Olivier Sigaud