Reasoning Performance
Reasoning performance in large language models (LLMs) is a central research area aimed at improving their ability to solve complex, multi-step problems. Current efforts focus on techniques such as chain-of-thought prompting, sampling diverse reasoning paths or perspectives, and using preference models and verifiers to refine those paths and filter out errors. These advances are crucial for building more reliable and robust AI systems, with implications for fields such as education, healthcare, and autonomous driving, where accurate and dependable reasoning is paramount.
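To make the verifier idea concrete, the sketch below shows a generic best-of-N loop: several chain-of-thought samples are drawn and a verifier score is used to keep the most promising one. It is only an illustration of the general pattern, not any specific paper's method; `generate_cot` and `verifier_score` are hypothetical stand-ins for an LLM sampling call and a learned (or heuristic) verifier.

```python
# Minimal sketch: verifier-filtered chain-of-thought sampling (best-of-N).
# `generate_cot` and `verifier_score` are placeholders, not a real API.
from typing import Callable, List, Tuple


def best_of_n_cot(
    question: str,
    generate_cot: Callable[[str], str],           # returns one reasoning path + answer
    verifier_score: Callable[[str, str], float],  # scores a (question, path) pair
    n_samples: int = 8,
) -> Tuple[str, float]:
    """Sample n reasoning paths and keep the one the verifier rates highest."""
    candidates: List[Tuple[str, float]] = []
    for _ in range(n_samples):
        path = generate_cot(question)            # e.g. "Let's think step by step..."
        score = verifier_score(question, path)   # higher score = more likely correct
        candidates.append((path, score))
    return max(candidates, key=lambda c: c[1])
```

In practice the verifier may be a reward/preference model or a simple answer-consistency check; the same selection loop applies either way.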
Papers
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic
Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang, Le Sun
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu