Paper ID: 2402.11997

Remember This Event That Year? Assessing Temporal Information and Reasoning in Large Language Models

Himanshu Beniwal, Kowsik Nandagopan D, Mayank Singh

Large Language Models (LLMs) are increasingly becoming ubiquitous, yet their ability to reason about and retain temporal information remains limited. This hinders their application in real-world scenarios where understanding the sequential nature of events is crucial. This paper experiments with state-of-the-art models on a novel, large-scale temporal dataset, \textbf{TempUN}, to reveal significant limitations in temporal retention and reasoning abilities. Interestingly, closed-source models indicate knowledge gaps more frequently, potentially suggesting a trade-off between uncertainty awareness and incorrect responses. Further, exploring various fine-tuning approaches yielded no major performance improvements. The associated dataset and code are available at the following URL (https://github.com/lingoiitgn/TempUN).

Submitted: Feb 19, 2024