Procedural Activity

Procedural activity research studies how humans perform multi-step tasks and how to model them, with the goal of building AI systems that can assist with or autonomously carry out such tasks. Current work focuses on learning representations of procedural activities from several data modalities (e.g., video, text, and sensor data), using techniques such as task graphs, state machines, and large language models. Much of it relies on multimodal learning and addresses challenges such as error detection and cross-view understanding. The field matters for robotics, human-computer interaction, and medical training, where it supports more efficient and robust automation and better human-AI collaboration.

Papers