Captioning Model
Image captioning models automatically generate textual descriptions of images or videos, aiming to accurately and comprehensively convey visual content. Current research emphasizes improving caption quality through techniques like reinforcement learning, leveraging large language models (LLMs) for contextualization and improved evaluation, and developing models capable of handling diverse data types such as live video streams and compressed video formats. These advancements have significant implications for various applications, including journalism, retail analytics, accessibility tools for the visually impaired, and enhancing the performance of other AI systems that rely on visual understanding.
Papers
September 22, 2023
August 21, 2023
August 13, 2023
July 31, 2023
July 17, 2023
June 23, 2023
June 17, 2023
June 15, 2023
June 6, 2023
May 26, 2023
May 22, 2023
April 27, 2023
March 23, 2023
February 13, 2023
December 27, 2022
December 20, 2022
December 6, 2022
December 1, 2022