Content Hallucination
Content hallucination, the generation of factually incorrect or inconsistent information by large language and vision-language models (LLMs and LVLMs), is a significant obstacle to their reliable deployment. Current research focuses on detecting and mitigating hallucinations, employing techniques such as hierarchical feedback learning, contrastive decoding, retrieval-augmented generation, and prompt engineering across a range of model architectures. Addressing this issue is crucial for improving the trustworthiness and safety of these models in applications ranging from medical diagnosis to financial reporting. The development of robust benchmarks and evaluation protocols is another key area of ongoing investigation.
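To make one of the mitigation strategies named above concrete, the sketch below illustrates the general idea behind contrastive decoding: next-token candidates are scored by how much more likely an "expert" distribution finds them than an "amateur" one, subject to a plausibility cutoff. This is a minimal, generic illustration, not the method of any paper listed below; the expert/amateur pairing, the alpha threshold, and the toy logits are all illustrative assumptions.

```python
import numpy as np

def contrastive_scores(expert_logits, amateur_logits, alpha=0.1):
    """Score next-token candidates by contrasting an 'expert' and an
    'amateur' next-token distribution (illustrative sketch).

    expert_logits, amateur_logits: 1-D arrays of next-token logits.
    alpha: plausibility threshold relative to the expert's top probability.
    """
    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    p_expert = softmax(expert_logits)
    p_amateur = softmax(amateur_logits)

    # Plausibility constraint: only tokens the expert assigns at least
    # alpha * (expert's max probability) remain candidates; the rest are
    # masked out so implausible tokens cannot win on contrast alone.
    mask = p_expert >= alpha * p_expert.max()

    scores = np.full_like(p_expert, -np.inf)
    scores[mask] = np.log(p_expert[mask]) - np.log(p_amateur[mask])
    return scores

# Toy 5-token vocabulary: token 1 is favored by the expert but not the
# amateur, so it receives the highest contrastive score.
expert = np.array([2.0, 1.5, 0.2, -1.0, -2.0])
amateur = np.array([2.0, 0.1, 0.3, -1.0, -2.0])
print(int(np.argmax(contrastive_scores(expert, amateur))))  # -> 1
```

In practice the two distributions come from, for example, a large and a small model, different layers of the same model, or clean versus perturbed inputs; the sketch only shows the scoring step shared by these variants.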
Papers
Chain-of-Verification Reduces Hallucination in Large Language Models
Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason Weston
Exploring the Relationship between LLM Hallucinations and Prompt Linguistic Nuances: Readability, Formality, and Concreteness
Vipula Rawte, Prachi Priya, S. M Towhidul Islam Tonmoy, S M Mehedi Zaman, Amit Sheth, Amitava Das
Sources of Hallucination by Large Language Models on Inference Tasks
Nick McKenna, Tianyi Li, Liang Cheng, Mohammad Javad Hosseini, Mark Johnson, Mark Steedman
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Sina J. Semnani, Violet Z. Yao, Heidi C. Zhang, Monica S. Lam
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen
HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation
David Dale, Elena Voita, Janice Lam, Prangthip Hansanti, Christophe Ropers, Elahe Kalbassi, Cynthia Gao, Loïc Barrault, Marta R. Costa-jussà