Instructional Video

Instructional videos are studied extensively as a basis for AI systems that understand and interact with procedural information presented visually. Current research focuses on automatically generating action plans from videos, localizing key steps within them, and improving the accuracy of video understanding with large language models (LLMs) and large vision-language models (LVLMs), often in combination with techniques such as diffusion models and self-training. This work advances AI capabilities in areas such as automated assistance, personalized learning, and efficient information retrieval from large online video repositories.

Papers