Visual Gap
The "visual gap" refers to the performance discrepancy between computer vision systems and human perception, particularly under domain shift, on unseen data, and in complex reasoning tasks. Current research aims to narrow this gap through techniques such as contrastive learning, the use of large language models (LLMs) to process visual information in a semantically richer way, and methods that mitigate hallucinations and biases in vision-language models. Addressing the visual gap is crucial for improving the robustness and reliability of AI systems in real-world applications, from object detection and visual question answering to more demanding tasks such as visual navigation and abstract visual reasoning.
Papers
November 13, 2024
October 7, 2024
June 14, 2024
May 24, 2024
January 6, 2024
November 20, 2023
September 4, 2023
October 27, 2022
September 10, 2022