Image Captioning Model
Image captioning models automatically generate textual descriptions of images, aiming to create captions that are both accurate and engaging. Current research focuses on improving caption quality through techniques like direct optimization using CLIP scores, developing more efficient architectures (e.g., those based on Fourier transforms), and enhancing robustness against adversarial attacks. These advancements are significant for various applications, including accessibility tools, content creation, and improving the performance of larger vision-language models, while also raising important considerations around AI safety and ethical deployment.
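The CLIP-score optimization mentioned above treats the cosine similarity between an image embedding and a caption embedding as a reward signal for ranking or directly optimizing candidate captions. The sketch below illustrates the idea with toy NumPy vectors standing in for real CLIP outputs; the `clip_reward` and `best_caption` helpers are illustrative names, not an API from any specific paper or library.

```python
import numpy as np

def cosine_similarity(a, b):
    # CLIP score is the cosine similarity between the image and text embeddings.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def clip_reward(image_emb, caption_emb):
    # Reward used to rank (or, with a differentiable model, directly
    # optimize) candidate captions against the image.
    return cosine_similarity(image_emb, caption_emb)

def best_caption(image_emb, candidates):
    # candidates: list of (caption_text, caption_embedding) pairs,
    # e.g. sampled from a captioning model's decoder.
    return max(candidates, key=lambda c: clip_reward(image_emb, c[1]))[0]

# Toy 512-dim embeddings standing in for real CLIP outputs.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
good = image_emb + 0.1 * rng.normal(size=512)  # caption close to the image
bad = rng.normal(size=512)                     # unrelated caption
print(best_caption(image_emb, [("a cat on a mat", good),
                               ("stock ticker chart", bad)]))
```

In practice the embeddings come from a pretrained CLIP encoder, and the reward can either rerank sampled captions (as here) or serve as a training objective for the captioning model itself.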