Paper ID: 2502.02393 • Published Feb 4, 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn
TL;DR
Get AI-generated summaries with premium
Get AI-generated summaries with premium
Chain-of-thought reasoning and scratchpads have emerged as critical tools for
enhancing the computational capabilities of transformers. While theoretical
results show that polynomial-length scratchpads can extend transformers'
expressivity from TC0 to PTIME, their required length remains poorly
understood. Empirical evidence even suggests that transformers need scratchpads
even for many problems in TC0, such as Parity or Multiplication, challenging
optimistic bounds derived from circuit complexity. In this work, we initiate
the study of systematic lower bounds for the number of CoT steps across
different algorithmic problems, in the hard-attention regime. We study a
variety of algorithmic problems, and provide bounds that are tight up to
logarithmic factors. Overall, these results contribute to emerging
understanding of the power and limitations of chain-of-thought reasoning.