Visual Commonsense
Visual commonsense research aims to equip computer vision systems with the ability to understand and reason about the implicit, everyday knowledge embedded in images, going beyond simple object recognition. Current efforts focus on developing models, often built on vision-language transformers and large language models, that can generate descriptive and diverse commonsense inferences, handle counter-intuitive scenarios, and accurately ground individuals within complex scenes. This work supports more robust, human-like reasoning in applications such as visual question answering, assistive robotics, and augmented reality.