Paper ID: 2407.03610

VDMA: Video Question Answering with Dynamically Generated Multi-Agents

Noriyuki Kugo, Tatsuya Ishibashi, Kosuke Ono, Yuji Sato

This technical report describes our approach to the EgoSchema Challenge 2024, in which participants must identify the most appropriate response to a question about a given video clip. We propose Video Question Answering with Dynamically Generated Multi-Agents (VDMA). VDMA complements existing response generation systems by employing a multi-agent system with dynamically generated expert agents, with the aim of providing the most accurate and contextually appropriate responses. The report details the stages of our approach, the tools employed, and the results of our experiments.

Submitted: Jul 4, 2024

Topics

Multi Agent
Multi Agent System
Video Question Answering
Video Question
Sophisticated Agent
EGO4D Challenge
