Visual Language Reasoning
Visual language reasoning focuses on enabling computers to understand and reason jointly over information presented in visual and textual form, with the aim of bridging the gap between human perception and machine intelligence. Current research emphasizes improving the performance of large vision-language models (LVLMs) on complex tasks such as multimodal fake news detection and abstract image understanding, often employing techniques like in-context learning and dual-system architectures to strengthen reasoning. These advances matter for applications ranging from autonomous driving safety to more effective information processing and analysis in any domain where visual and textual data are intertwined.
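To make the in-context learning idea concrete, below is a minimal Python sketch of how few-shot multimodal prompts are typically assembled for an LVLM: labeled image/answer demonstrations are interleaved before the unlabeled query so the model can imitate the input-to-output mapping. The chat-message structure follows the widely used role/content-parts convention; query_lvlm is a hypothetical placeholder for whichever inference endpoint is actually used, not a real library call, and the URLs and the fake-news labels are illustrative assumptions.

from typing import Dict, List


def build_icl_messages(
    exemplars: List[Dict[str, str]],  # each: {"image_url": ..., "answer": ...}
    query_image_url: str,
    task_instruction: str,
) -> List[Dict]:
    """Assemble in-context demonstrations followed by the query image,
    in the common chat-message format (role plus content parts)."""
    messages: List[Dict] = [{"role": "system", "content": task_instruction}]
    for ex in exemplars:
        # One demonstration: the image as user input, the label as the
        # assistant's reply, so the model can mimic the mapping.
        messages.append({
            "role": "user",
            "content": [{"type": "image_url",
                         "image_url": {"url": ex["image_url"]}}],
        })
        messages.append({"role": "assistant", "content": ex["answer"]})
    # Finally, the unlabeled query image the model should reason about.
    messages.append({
        "role": "user",
        "content": [{"type": "image_url",
                     "image_url": {"url": query_image_url}}],
    })
    return messages


if __name__ == "__main__":
    msgs = build_icl_messages(
        exemplars=[
            {"image_url": "https://example.com/real_post.jpg", "answer": "real"},
            {"image_url": "https://example.com/fake_post.jpg", "answer": "fake"},
        ],
        query_image_url="https://example.com/query_post.jpg",
        task_instruction="Classify each news image-post as 'real' or 'fake'.",
    )
    # response = query_lvlm(msgs)  # hypothetical inference call
    for m in msgs:
        print(m["role"], "->", m["content"])

The same pattern extends naturally to dual-system setups, where a fast perception pass produces candidate descriptions and a slower language-side reasoning pass consumes them; only the message-assembly step above changes.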