Mathematical Reasoning
Mathematical reasoning in large language models (LLMs) is a burgeoning research area focused on evaluating and improving these models' ability to solve mathematical problems, encompassing both symbolic and numerical reasoning. Current research emphasizes developing more robust benchmarks that assess not only final-answer accuracy but also the reasoning process itself, including error detection and correction, and explores training methods such as reinforcement learning from human feedback and instruction tuning to improve model performance. The field is significant because advances in the mathematical reasoning capabilities of LLMs have broad implications for applications such as education, scientific discovery, and automated problem solving.
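The distinction between final-answer accuracy and process-level evaluation can be made concrete with a minimal sketch of the former. The following is illustrative only: the extraction heuristic, the example predictions, and the references are hypothetical and not drawn from any benchmark listed below; it simply shows why exact-match scoring of final answers says nothing about whether the intermediate reasoning was sound.

```python
# Minimal sketch (illustrative, hypothetical data): scoring a model's final
# answers on a math benchmark by normalized exact match.
import re

def extract_final_answer(text: str) -> str:
    """Heuristic: take the last number-like token as the final answer."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text)
    return matches[-1] if matches else ""

def normalize(ans: str) -> str:
    """Normalize numeric strings so that '7.0' matches '7'."""
    ans = ans.strip()
    try:
        num = float(ans)
        return str(int(num)) if num == int(num) else str(num)
    except ValueError:
        return ans

def final_answer_accuracy(predictions, references):
    """Fraction of items whose normalized final answer matches the reference."""
    correct = sum(
        normalize(extract_final_answer(p)) == normalize(r)
        for p, r in zip(predictions, references)
    )
    return correct / len(references)

preds = ["The total is 12 + 30 = 42.", "So x = 7.0", "I think the answer is 9"]
refs = ["42", "7", "10"]
print(final_answer_accuracy(preds, refs))  # 2 of 3 answers match
```

Note that a prediction can score as correct here even if every intermediate step was wrong, which is precisely the gap that process-level benchmarks and error-detection evaluations aim to close.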
Papers
FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
Elliot Glazer, Ege Erdil, Tamay Besiroglu, Diego Chicharro, Evan Chen, Alex Gunning, Caroline Falkman Olsson, Jean-Stanislas Denain, Anson Ho, Emily de Oliveira Santos, Olli Järviniemi, Matthew Barnett, Robert Sandler, Matej Vrzala, Jaime Sevilla, Qiuyu Ren, Elizabeth Pratt, Lionel Levine, Grant Barkley, Natalie Stewart, Bogdan Grechuk, Tetiana Grechuk, Shreepranav Varma Enugandla, Mark Wildon
Kwai-STaR: Transform LLMs into State-Transition Reasoners
Xingyu Lu, Yuhang Hu, Changyi Liu, Tianke Zhang, Zhenyu Yang, Zhixiang Ding, Shengsheng Qian, Meng Du, Ruiwen Kang, Kaiyu Tang, Fan Yang, Tingting Gao, Di Zhang, Hai-Tao Zheng, Bin Wen
Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning
Kuofeng Gao, Huanqia Cai, Qingyao Shuai, Dihong Gong, Zhifeng Li
CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning
Joshua Ong Jun Leang, Aryo Pradipta Gema, Shay B. Cohen