Historical Language

Historical language research applies and adapts natural language processing (NLP) techniques to texts from past eras, addressing challenges such as linguistic variation, scarce annotated data, and the lack of standardized orthography in many historical texts. Current work emphasizes adapting pretrained language models such as XLM-RoBERTa and GPT-2 to the distinctive characteristics of historical corpora, often through parameter-efficient fine-tuning and adapter training. This research is significant for its potential to unlock vast historical archives for analysis, enabling new insights into past societies and cultures while also advancing the development of more robust and adaptable NLP models.
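
As a concrete illustration of the adaptation recipe described above, the sketch below continues masked-language-model training of XLM-RoBERTa on a historical text corpus using LoRA adapters from the Hugging Face peft library, so only a small fraction of the parameters is updated. The corpus path, hyperparameters, and output directory are illustrative placeholders, not settings from any particular paper.

```python
# Sketch: parameter-efficient (LoRA) adaptation of XLM-RoBERTa to a historical corpus.
# "historical_corpus.txt" is a placeholder path; substitute your own plain-text data.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Attach low-rank adapters to the attention query/value projections;
# the original pretrained weights stay frozen.
lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.1,
                         target_modules=["query", "value"])
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the full model

# Plain-text historical corpus, one passage per line (placeholder file).
dataset = load_dataset("text", data_files={"train": "historical_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Continue masked-language-model pretraining on the historical domain.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="xlmr-historical-lora",
                           per_device_train_batch_size=16,
                           num_train_epochs=3,
                           learning_rate=5e-4),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```

Because the backbone stays frozen, each historical period or spelling variant needs only a small adapter checkpoint rather than a full model copy, which is one reason adapter-style training is attractive for low-resource historical domains.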

Papers