Reasoning Datasets
Reasoning datasets are collections of problems designed to evaluate and improve the reasoning capabilities of large language models (LLMs). Current research focuses on building larger, more diverse datasets that cover several reasoning types (mathematical, commonsense, logical) and incorporate multimodal data (text and images). Researchers pair these datasets with techniques such as chain-of-thought prompting, process supervision, and tool augmentation (e.g., integrating external calculators or search engines) to improve LLMs' ability to solve complex problems. Robust reasoning datasets are important both for advancing LLM capabilities and for supporting their reliable use in fields such as healthcare and education.
Papers
100 instances is all you need: predicting the success of a new LLM on unseen data by testing on a few instances
Lorenzo Pacchiardi, Lucy G. Cheke, José Hernández-Orallo
Strategic Chain-of-Thought: Guiding Accurate Reasoning in LLMs through Strategy Elicitation
Yu Wang, Shiwan Zhao, Zhihu Wang, Heyuan Huang, Ming Fan, Yubo Zhang, Zhixing Wang, Haijun Wang, Ting Liu
Enhancing Healthcare LLM Trust with Atypical Presentations Recalibration
Jeremy Qin, Bang Liu, Quoc Dinh Nguyen