Lexical Sensitivity
Lexical sensitivity in language models (LMs) refers to the disproportionate impact of seemingly minor word changes on model performance, even when the substituted words are near-synonyms or the changes are imperceptible to humans. Current research focuses on understanding this sensitivity in transformer architectures such as BERT and GPT, tracing its origins to the attention mechanism and the role of the softmax function, and developing mitigations such as combinatorial optimization over prompt phrasings.