Monolingual Pre-Trained Language Models

Monolingual pre-trained language models are trained on large amounts of text from a single language, with the goal of outperforming multilingual counterparts within that language. Current research emphasizes efficient training for low-resource languages, including techniques such as model adaptation and cross-lingual transfer learning, often building on encoder architectures like BERT and on LLMs such as Llama. This line of work matters because it addresses the performance limitations of multilingual models in individual languages and offers a more cost-effective and sustainable path to high-quality language models for a wider range of languages.
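As a concrete illustration of the model-adaptation idea mentioned above, the following minimal sketch continues masked-language-model pretraining of a multilingual checkpoint on a monolingual corpus using Hugging Face Transformers. It is not the method of any particular paper listed below; the checkpoint name, corpus path, and hyperparameters are placeholder assumptions.

```python
# Sketch: language-adaptive continued pretraining (MLM) of a multilingual model
# on monolingual text. File paths and hyperparameters are illustrative only.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "bert-base-multilingual-cased"  # multilingual starting point
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Monolingual corpus as a plain-text file, one document per line (hypothetical path).
dataset = load_dataset("text", data_files={"train": "mono_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking for the MLM objective (15% of tokens masked per batch).
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="adapted-model",
    per_device_train_batch_size=16,
    num_train_epochs=1,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```

In practice, this kind of continued pretraining is often paired with tokenizer or embedding adaptation for the target language before fine-tuning on downstream tasks.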

Papers