Reasoning Performance
Reasoning performance, the ability of large language models (LLMs) to solve complex, multi-step problems, is a central focus of current research. Efforts to improve it include chain-of-thought prompting, exploring diverse reasoning paths, and using preference models and verifiers to refine those paths and filter out errors. These advances matter for building reliable, robust AI systems in fields such as education, healthcare, and autonomous driving, where dependable reasoning is paramount.
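The verifier-based approach mentioned above follows a simple sample-then-verify pattern: draw several chain-of-thought paths and keep the one a verifier scores highest (best-of-N). The sketch below is illustrative only; sample_reasoning_path and verifier_score are hypothetical stubs standing in for an LLM call and a trained verifier, not the method of any specific paper listed here.

```python
# Minimal sketch of verifier-filtered chain-of-thought sampling (best-of-N).
# The stubs below are hypothetical placeholders for real model calls.
import random

def sample_reasoning_path(question: str, seed: int) -> tuple[str, str]:
    """Stub for an LLM sampled with a chain-of-thought prompt.

    Returns a (reasoning_chain, final_answer) pair."""
    rng = random.Random(seed)
    chain = f"Step 1: restate '{question}'. Step 2: work it out (sample {seed})."
    answer = rng.choice(["42", "41", "42"])  # toy distribution; mostly correct
    return chain, answer

def verifier_score(question: str, chain: str, answer: str) -> float:
    """Stub for a preference/verifier model scoring one reasoning path."""
    return 1.0 if answer == "42" else 0.0  # toy verifier: checks the answer

def best_of_n(question: str, n: int = 8) -> str:
    """Sample n reasoning paths and return the answer of the best-scored one."""
    candidates = [sample_reasoning_path(question, seed) for seed in range(n)]
    scored = [(verifier_score(question, c, a), c, a) for c, a in candidates]
    _, _, best_answer = max(scored)  # highest verifier score wins
    return best_answer

if __name__ == "__main__":
    print(best_of_n("What is 6 * 7?"))  # prints "42" with high probability
```

In practice the verifier is itself a learned model (a preference or process-reward model), and the same loop generalizes from picking one answer to pruning and re-ranking intermediate reasoning steps.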
Papers
Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance?
Fumiya Uchiyama, Takeshi Kojima, Andrew Gambardella, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning
Xiyao Wang, Linfeng Song, Ye Tian, Dian Yu, Baolin Peng, Haitao Mi, Furong Huang, Dong Yu
From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency
Kaiyue Wen, Huaqing Zhang, Hongzhou Lin, Jingzhao Zhang
Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths
Yew Ken Chia, Guizhen Chen, Weiwen Xu, Luu Anh Tuan, Soujanya Poria, Lidong Bing
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi
Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic
Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang, Le Sun
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
Tian Ye, Zicheng Xu, Yuanzhi Li, Zeyuan Allen-Zhu