Captioning Metrics
Captioning metrics evaluate the quality of automatically generated image or video descriptions, with the goal of aligning automated scores with human judgment. Recent work focuses on reference-free metrics that use large multimodal models such as CLIP to compare a generated caption directly against its visual content, often adding hierarchical or fine-grained comparisons to improve accuracy and interpretability. These approaches address the limitations of traditional reference-based metrics, which depend on scarce human-annotated references and may miss the nuances of modern, highly detailed captions, and thereby improve both the evaluation and the development of image and video captioning systems.
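To make the reference-free idea concrete, here is a minimal sketch of a CLIPScore-style metric: the score is a rescaled cosine similarity between an image embedding and a caption embedding, clipped at zero. The function below assumes the embeddings come from a CLIP-like dual encoder (producing the embeddings is out of scope here); the weight `w = 2.5` follows the common CLIPScore convention, and the helper name `clipscore_like` is illustrative, not a library API.

```python
import numpy as np

def clipscore_like(image_emb: np.ndarray, caption_emb: np.ndarray, w: float = 2.5) -> float:
    """Reference-free caption score: w * max(cos(image, caption), 0).

    Assumes `image_emb` and `caption_emb` are vectors from a shared
    CLIP-like embedding space (the encoder itself is not shown).
    """
    # Normalize both embeddings so the dot product equals cosine similarity.
    i = image_emb / np.linalg.norm(image_emb)
    c = caption_emb / np.linalg.norm(caption_emb)
    # Negative similarities are clipped to zero, then rescaled by w.
    return w * max(float(i @ c), 0.0)

# Toy usage with stand-in embeddings (real ones would come from an encoder):
img = np.array([0.8, 0.6, 0.0])
good_caption = np.array([0.7, 0.7, 0.1])   # similar direction -> high score
bad_caption = np.array([-0.5, 0.1, 0.9])   # dissimilar -> low score
print(clipscore_like(img, good_caption) > clipscore_like(img, bad_caption))
```

Because no human reference captions appear anywhere in the computation, the metric sidesteps the scarce-annotation problem that reference-based metrics face.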