Paper ID: 2502.02393 • Published Feb 4, 2025

Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers

Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn
Chain-of-thought (CoT) reasoning and scratchpads have emerged as critical tools for enhancing the computational capabilities of transformers. While theoretical results show that polynomial-length scratchpads can extend transformers' expressivity from TC⁰ to PTIME, the scratchpad length actually required remains poorly understood. Empirical evidence suggests that transformers need scratchpads even for many problems in TC⁰, such as Parity or Multiplication, challenging optimistic bounds derived from circuit complexity. In this work, we initiate the systematic study of lower bounds on the number of CoT steps across different algorithmic problems, in the hard-attention regime. We study a variety of algorithmic problems and provide bounds that are tight up to logarithmic factors. Overall, these results contribute to the emerging understanding of the power and limitations of chain-of-thought reasoning.
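To make the Parity example concrete: a natural linear-length scratchpad records the running parity of the input prefix, emitting one token per step, so the final scratchpad token is the answer. The sketch below is an illustration of such a chain-of-thought, not the paper's construction.

```python
def parity_with_scratchpad(bits: str) -> list[str]:
    """Emit a chain-of-thought for Parity: one scratchpad token per input bit.

    Each step appends the parity of the prefix seen so far, so the last
    token is the parity of the whole input. This mirrors the linear-length
    scratchpads that step-by-step solutions for Parity typically use.
    """
    chain = []
    acc = 0
    for b in bits:
        acc ^= int(b)           # fold in the next input bit
        chain.append(str(acc))  # scratchpad token: parity of the prefix
    return chain

# Example: parity of "10110" computed in 5 chain-of-thought steps.
print(parity_with_scratchpad("10110"))  # ['1', '1', '0', '1', '1'] -> answer 1
```

Each step depends only on the previous scratchpad token and one fresh input bit, which is the kind of local, step-by-step computation that makes such scratchpads easy for a transformer to emit, at the cost of a number of steps linear in the input length.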