Visual Reasoning Ability
Visual reasoning research aims to understand and replicate the human ability to draw inferences and solve problems from visual information. Current efforts focus on developing and evaluating multimodal models, particularly those that combine large language models (LLMs) with vision-language models (VLMs), and often use techniques such as chain-of-thought prompting and multimodal in-context learning to improve reasoning performance. The work matters for applications ranging from medical image analysis and robotics to more general-purpose AI systems capable of complex problem-solving.
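As a rough illustration of how the two prompting techniques named above can be combined, the sketch below assembles an interleaved image-and-text prompt: a few worked demonstrations (multimodal in-context learning), each with a written rationale, followed by a new query that ends with a step-by-step cue (chain-of-thought prompting). The `Demo` class, the `build_prompt` function, the part dictionaries, and the image identifiers are all hypothetical and chosen for illustration; they do not correspond to any particular model's API.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One worked demonstration (hypothetical structure)."""
    image_ref: str   # placeholder identifier for an image, not a real path
    question: str
    rationale: str   # the intermediate reasoning shown to the model
    answer: str


def build_prompt(demos, image_ref, question):
    """Interleave demonstrations with the new query as a list of parts."""
    parts = []
    for d in demos:
        # Each demonstration is an image followed by question, reasoning, answer.
        parts.append({"type": "image", "ref": d.image_ref})
        parts.append({
            "type": "text",
            "text": (
                f"Question: {d.question}\n"
                f"Reasoning: {d.rationale}\n"
                f"Answer: {d.answer}"
            ),
        })
    # The new query ends with a cue that invites step-by-step reasoning.
    parts.append({"type": "image", "ref": image_ref})
    parts.append({
        "type": "text",
        "text": f"Question: {question}\nReasoning: Let's think step by step.",
    })
    return parts


demos = [
    Demo("demo_1", "How many red blocks are there?",
         "There are two red blocks on the left and one on the right.", "3"),
    Demo("demo_2", "Is the cup left of the plate?",
         "The cup is at the left edge and the plate is in the center.", "Yes"),
]
prompt = build_prompt(demos, "query_image", "Which object is largest?")
```

The resulting list would still need to be converted into whatever input format a given model expects; that step is model-specific and is left out here.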
Papers
October 24, 2024
October 23, 2024
September 21, 2024
September 19, 2024
August 27, 2024
July 1, 2024
June 11, 2024
October 23, 2023
September 8, 2023
March 21, 2023
October 1, 2022
May 6, 2022