Question-Answer Pair
Question-answer pairs (QAPs) are a core tool for evaluating and improving AI models, particularly large language models (LLMs), across domains such as commonsense reasoning, finance, and scientific literature. Current research focuses on building robust QAP datasets that reflect real-world complexity, including multimodal data (images, charts) and nuanced language, and on applying techniques such as retrieval-augmented generation (RAG) and chain-of-thought prompting to improve model performance and interpretability. High-quality QAPs are crucial for benchmarking progress, identifying model limitations, and driving the development of more accurate, reliable, and explainable AI systems.
Papers
ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering
Raphael Gruber, Abdelrahman Abdallah, Michael Färber, Adam Jatowt
CRAG -- Comprehensive RAG Benchmark
Xiao Yang, Kai Sun, Hao Xin, Yushi Sun, Nikita Bhalla, Xiangsen Chen, Sajal Choudhary, Rongze Daniel Gui, Ziran Will Jiang, Ziyu Jiang, Lingkun Kong, Brian Moran, Jiaqi Wang, Yifan Ethan Xu, An Yan, Chenyu Yang, Eting Yuan, Hanwen Zha, Nan Tang, Lei Chen, Nicolas Scheffer, Yue Liu, Nirav Shah, Rakesh Wanga, Anuj Kumar, Wen-tau Yih, Xin Luna Dong