Romanian Natural Language Processing

Research on Romanian natural language processing (NLP) is expanding rapidly, driven by the need for robust language models in a language with far fewer readily available resources than English. Current efforts focus on building large Romanian corpora, training large language models (LLMs) based on architectures such as BERT and Llama, and adapting existing multilingual models through techniques such as QLoRA and cross-lingual domain adaptation. This work advances NLP capabilities for Romanian, enabling improvements in applications such as machine translation, question answering, and speech recognition, and provides valuable resources for the broader NLP research community.
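The adapter-based approach mentioned above can be illustrated with the low-rank update at the heart of LoRA, which QLoRA builds on (QLoRA additionally quantizes the frozen base weights to 4-bit). The following is a minimal NumPy sketch with illustrative shapes and names, not the implementation from any specific library: a frozen weight matrix `W` is augmented by the scaled product of two small trainable matrices, so only a fraction of the parameters are updated during adaptation.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 64, 64, 4      # r << d_in: the low-rank bottleneck
alpha = 8                       # scaling hyperparameter

W = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01 # trainable down-projection
B = np.zeros((d_out, r))                  # zero-init, so W' == W at start

def adapted_forward(x):
    # Effective weight W' = W + (alpha / r) * B @ A, applied without
    # materializing W' -- the trick LoRA uses to keep memory low.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B is zero-initialized, the adapted model initially matches the base.
assert np.allclose(adapted_forward(x), W @ x)

# Trainable parameters: r * (d_in + d_out) instead of d_in * d_out.
print(r * (d_in + d_out), "trainable vs", d_in * d_out, "full")
```

With these shapes, only 512 of 4,096 weight-matrix parameters are trainable; this is why such adapters make it practical to specialize large multilingual models to Romanian on modest hardware.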

Papers