Question-Answer Pair
Question-answer pairs (QAPs) are a basic unit for evaluating and improving AI models, particularly large language models (LLMs), in domains such as commonsense reasoning, finance, and scientific literature. Current research focuses on building QAP datasets that reflect real-world complexity, including multimodal inputs (images, charts) and nuanced language, and on applying techniques such as retrieval-augmented generation (RAG) and chain-of-thought prompting to improve model performance and interpretability. High-quality QAPs are used to benchmark progress and expose model limitations, which in turn supports the development of more accurate, reliable, and explainable AI systems.
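As a rough illustration of how QAPs serve as a benchmark, the sketch below scores a model's answers against reference answers using normalized exact match. It is a minimal example under stated assumptions, not the protocol of any paper listed here: the `QAPair` class, the `normalize` and `exact_match_accuracy` helpers, the toy data, and the `toy_model` stand-in are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
import string


@dataclass(frozen=True)
class QAPair:
    """One question with its reference answer (hypothetical container)."""
    question: str
    answer: str


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def exact_match_accuracy(pairs, predict) -> float:
    """Fraction of pairs whose prediction equals the reference after normalization."""
    if not pairs:
        return 0.0
    hits = sum(normalize(predict(p.question)) == normalize(p.answer) for p in pairs)
    return hits / len(pairs)


# Toy reference set (made-up data, for illustration only).
pairs = [
    QAPair("What is the capital of Italy?", "Rome"),
    QAPair("How many days are in a week?", "7"),
    QAPair("What gas do plants absorb?", "Carbon dioxide"),
]

# Canned answers standing in for a real model; in practice `predict`
# would wrap an LLM call, optionally with retrieval or a reasoning prompt.
canned = {
    "What is the capital of Italy?": "rome.",
    "How many days are in a week?": "seven",
    "What gas do plants absorb?": "Carbon Dioxide",
}


def toy_model(question: str) -> str:
    return canned.get(question, "")


accuracy = exact_match_accuracy(pairs, toy_model)
print(f"exact match: {accuracy:.2f}")
```

Exact match is deliberately strict: the answer "seven" is counted wrong against the reference "7", which is one reason benchmarks often pair it with softer metrics or human or model-based grading.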
Papers
Compromesso! Italian Many-Shot Jailbreaks Undermine the Safety of Large Language Models
Fabio Pernisi, Dirk Hovy, Paul Röttger
Open-domain Implicit Format Control for Large Language Model Generation
Yiqun Yao, Wenjia Ma, Xuezhi Fang, Xin Jiang, Xiang Li, Xuying Meng, Peng Han, Jing Li, Aixin Sun, Yequan Wang
ComplexTempQA: A Large-Scale Dataset for Complex Temporal Question Answering
Raphael Gruber, Abdelrahman Abdallah, Michael Färber, Adam Jatowt
CRAG -- Comprehensive RAG Benchmark
Xiao Yang, Kai Sun, Hao Xin, Yushi Sun, Nikita Bhalla, Xiangsen Chen, Sajal Choudhary, Rongze Daniel Gui, Ziran Will Jiang, Ziyu Jiang, Lingkun Kong, Brian Moran, Jiaqi Wang, Yifan Ethan Xu, An Yan, Chenyu Yang, Eting Yuan, Hanwen Zha, Nan Tang, Lei Chen, Nicolas Scheffer, Yue Liu, Nirav Shah, Rakesh Wanga, Anuj Kumar, Wen-tau Yih, Xin Luna Dong