Image Language
Image-language research bridges visual and linguistic information, aiming to build models that understand images and videos and generate descriptions of them. Current efforts concentrate on improving generalization across data distributions, developing efficient training methods for large models (such as CLIP and its variants), and adapting these models to tasks including video understanding, robotic control, and emotional reasoning. The field matters for artificial intelligence more broadly because it enables applications ranging from improved image retrieval and captioning to more sophisticated human-computer interaction and robotic manipulation.
Papers
October 24, 2024
April 2, 2024
February 15, 2024
November 25, 2023
October 30, 2023
July 27, 2023
June 1, 2023
May 31, 2023
May 11, 2023
September 19, 2022
June 1, 2022