Paper ID: 2308.11295

Uncertainty Estimation of Transformers' Predictions via Topological Analysis of the Attention Matrices

Elizaveta Kostenok, Daniil Cherniavskii, Alexey Zaytsev

Transformer-based language models have set new benchmarks across a wide range of NLP tasks, yet reliably estimating the uncertainty of their predictions remains a significant challenge. Existing uncertainty estimation (UE) techniques often fall short in classification tasks, either offering minimal improvements over basic heuristics or relying on costly ensemble models. Moreover, attempts to leverage common embeddings for UE in linear probing scenarios have yielded only modest gains, indicating that alternative model components should be explored. We tackle these limitations by harnessing the geometry of attention maps across multiple heads and layers to assess model confidence. Our approach extracts topological features from attention matrices, providing a low-dimensional, interpretable representation of the model's internal dynamics. Additionally, we introduce topological features to compare attention patterns across heads and layers. Our method significantly outperforms existing UE techniques on benchmarks for acceptability judgments and artificial text detection, offering a more efficient and interpretable solution for uncertainty estimation in large-scale language models.

Submitted: Aug 22, 2023