Hallucination Evaluation
Hallucination evaluation in large language models (LLMs) and vision-language models (VLMs) focuses on methods that detect and quantify fabricated or factually inaccurate content in model outputs. Current research emphasizes automated evaluation metrics, often based on question-answering or knowledge-graph comparisons, to assess factual consistency and faithfulness across modalities (text, images, video) and tasks (summarization, question answering, code generation). These advances are crucial for improving the reliability and trustworthiness of LLMs and VLMs, particularly in high-stakes applications such as healthcare and autonomous systems, where factual accuracy is paramount.
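To make the knowledge-graph-comparison idea concrete, here is a minimal sketch of a claim-grounding check. It assumes that claims have already been extracted from the model's output as (subject, relation, object) triples by some upstream step; the `hallucination_rate` function, the toy knowledge graph, and the example triples are all illustrative, not a specific method from any of the papers listed below.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

def hallucination_rate(claims: Set[Triple], knowledge_graph: Set[Triple]) -> float:
    """Fraction of generated claims not supported by the reference knowledge graph.

    0.0 means every claim is grounded; 1.0 means every claim is fabricated.
    """
    if not claims:
        return 0.0
    unsupported = {c for c in claims if c not in knowledge_graph}
    return len(unsupported) / len(claims)

# Reference facts (a tiny stand-in for a real knowledge graph).
kg = {
    ("insulin", "treats", "diabetes"),
    ("aspirin", "treats", "headache"),
}

# Claims extracted from a model's output (the extraction step is assumed to
# happen upstream, e.g. via an OpenIE-style tool or a prompted LLM).
generated = {
    ("insulin", "treats", "diabetes"),   # supported by the KG
    ("aspirin", "treats", "diabetes"),   # unsupported -> counted as a hallucination
}

print(f"hallucination rate: {hallucination_rate(generated, kg):.2f}")  # 0.50
```

In practice the exact-match lookup would be replaced with entity linking and fuzzy relation matching, and QA-based metrics take the complementary route of asking questions over the source and the output and comparing answers; the score above is just the simplest grounded/ungrounded ratio.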