Hallucination Evaluation
Hallucination evaluation for large language models (LLMs) and vision-language models (VLMs) develops methods to detect and quantify inaccurate or fabricated content in model outputs. Current research emphasizes automated metrics, often built on question answering or knowledge graph comparison, that assess factual consistency and faithfulness across modalities (text, images, video) and tasks (summarization, question answering, code generation). Such metrics are central to making LLMs and VLMs more reliable and trustworthy, particularly in high-stakes applications like healthcare and autonomous systems, where factual accuracy is paramount.
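As a rough illustration of the knowledge graph comparison idea mentioned above, the sketch below scores a model output by the fraction of its claims that are absent from a reference set of facts. It is a minimal sketch, not a metric taken from any listed paper: the (subject, relation, object) triple representation, the function names `normalize` and `hallucination_rate`, and the exact-match rule are all assumptions made for illustration, and the step that extracts triples from free text is assumed to happen upstream.

```python
from typing import Iterable, Tuple

# Hypothetical representation: a claim is a (subject, relation, object) triple.
Triple = Tuple[str, str, str]


def normalize(triple: Triple) -> Triple:
    """Lowercase each part and collapse whitespace so trivial variants match."""
    return tuple(" ".join(part.lower().split()) for part in triple)


def hallucination_rate(claimed: Iterable[Triple], reference: Iterable[Triple]) -> float:
    """Fraction of claimed triples with no exact match in the reference set.

    Assumption: exact match after normalization stands in for "supported".
    Real evaluators typically need entailment or fuzzy matching instead.
    """
    ref = {normalize(t) for t in reference}
    claims = [normalize(t) for t in claimed]
    if not claims:
        return 0.0  # no claims made, so nothing can be unsupported
    unsupported = sum(1 for t in claims if t not in ref)
    return unsupported / len(claims)


# Toy data, invented purely for this example.
reference = [
    ("Paris", "capital of", "France"),
    ("Seine", "flows through", "Paris"),
]
claimed = [
    ("paris", "capital of", "france"),      # supported by the reference
    ("Paris", "population", "12 million"),  # not in the reference
]

rate = hallucination_rate(claimed, reference)
print(f"hallucination rate: {rate:.2f}")
```

Exact matching keeps the sketch easy to check by hand, but it counts paraphrases of supported facts as hallucinations, which is one reason question-answering-based checks are often used instead.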
Papers
December 29, 2024
December 25, 2024
December 7, 2024
November 16, 2024
November 15, 2024
November 14, 2024
October 30, 2024
October 16, 2024
October 13, 2024
September 22, 2024
September 20, 2024
September 19, 2024
September 16, 2024
August 2, 2024
July 17, 2024
July 15, 2024
July 11, 2024
June 14, 2024
June 11, 2024
May 24, 2024