Descriptive Caption
Descriptive captioning, the automated generation of textual descriptions for images or audio, aims to bridge the gap between perceptual modeling (such as computer vision) and natural language processing. Current research focuses on improving caption detail and cultural awareness, often by leveraging large language models and pre-trained vision-language models such as BLIP, and on exploring data augmentation techniques that improve model performance, particularly in low-data regimes. These advances matter for applications such as news reporting, content generation, and image retrieval, where they enable more nuanced and informative descriptions of visual and auditory data.
Papers
March 24, 2024
February 8, 2024
December 8, 2023
November 14, 2023
August 5, 2023
December 11, 2022
July 15, 2022
May 12, 2022
April 27, 2022
April 18, 2022
January 7, 2022
January 4, 2022