LLM Reasoning
Research on Large Language Model (LLM) reasoning focuses on improving the ability of LLMs to perform complex, multi-step reasoning tasks, typically by augmenting them with techniques such as chain-of-thought prompting, reinforcement learning (RL), and integration with symbolic reasoning methods. Current efforts concentrate on enhancing the accuracy and reliability of LLM reasoning, addressing issues such as hallucination and inconsistent performance across domains and tasks, notably through improved credit assignment in RL and the development of new evaluation metrics. These advances matter because reliable LLM reasoning is crucial for building trustworthy AI systems across diverse applications, from robotics and healthcare to scientific discovery and decision support.
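As a concrete illustration of the chain-of-thought prompting mentioned above, the sketch below builds a few-shot prompt whose exemplar spells out intermediate reasoning steps before stating a final answer. This is a minimal, generic sketch, not the method of any paper listed here; the `query_llm` callable is a hypothetical stand-in for whatever model interface is actually used.

```python
# Minimal chain-of-thought (CoT) prompting sketch.
# `query_llm` is a placeholder for any text-completion interface
# (API client, local model, etc.); it is not tied to a specific library.
from typing import Callable

COT_EXEMPLAR = (
    "Q: A shop sells pens in packs of 12. If Maria buys 4 packs and gives away "
    "9 pens, how many pens does she have left?\n"
    "A: Let's think step by step. 4 packs * 12 pens = 48 pens. "
    "48 - 9 = 39. The answer is 39.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend a worked exemplar and ask the model to reason step by step."""
    return COT_EXEMPLAR + f"Q: {question}\nA: Let's think step by step."

def answer_with_cot(question: str, query_llm: Callable[[str], str]) -> str:
    """Send the CoT prompt to the model and return its completion,
    which should contain intermediate steps followed by a final answer."""
    return query_llm(build_cot_prompt(question))

if __name__ == "__main__":
    # Stub model so the script runs without any external dependency.
    def echo_model(prompt: str) -> str:
        return "(model completion with step-by-step reasoning goes here)"

    print(answer_with_cot(
        "If a train travels 60 km in 40 minutes, what is its speed in km/h?",
        echo_model,
    ))
```

In practice, the exemplar and the "Let's think step by step" cue encourage the model to emit its intermediate reasoning, which is what process-level verifiers and RL credit-assignment methods (such as those studied in the papers below) then score or reward.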
Papers
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
Amrith Setlur, Chirag Nagpal, Adam Fisch, Xinyang Geng, Jacob Eisenstein, Rishabh Agarwal, Alekh Agarwal, Jonathan Berant, Aviral Kumar
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
Zirui Zhao, Hanze Dong, Amrita Saha, Caiming Xiong, Doyen Sahoo
Not All LLM Reasoners Are Created Equal
Arian Hosseini, Alessandro Sordoni, Daniel Toyama, Aaron Courville, Rishabh Agarwal
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
Amirhossein Kazemnejad, Milad Aghajohari, Eva Portelance, Alessandro Sordoni, Siva Reddy, Aaron Courville, Nicolas Le Roux
AHP-Powered LLM Reasoning for Multi-Criteria Evaluation of Open-Ended Responses
Xiaotian Lu, Jiyi Li, Koh Takeuchi, Hisashi Kashima