Visual Dialogue
Visual dialogue research builds systems that can understand and generate meaningful conversations grounded in visual information such as images or videos. Current work concentrates on integrating large language models with vision-language models so that systems can better interpret complex dialogues and generate relevant visual responses, often relying on novel evaluation metrics and synthetic datasets to compensate for the limitations of existing data. The field matters because it advances more natural and informative human-computer interaction, with applications ranging from improved image retrieval to chatbots that understand multimodal context.