Pre-Trained Multilingual Model
Pre-trained multilingual models are large language models trained on massive datasets spanning many languages, with the aim of improving cross-lingual understanding and task performance. Current research focuses on mitigating data bias, improving efficiency (e.g., faster text generation and data-efficient fine-tuning), and strengthening zero-shot cross-lingual transfer, in which a model fine-tuned on labeled data in one language is applied to another language without additional labeled data. Much of this work builds on transformer-based architectures such as BERT and its variants. These models are used across NLP tasks, from machine translation and named entity recognition to question answering and speech-to-text, and they particularly benefit low-resource languages by enabling knowledge transfer from high-resource ones.
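To make the idea of zero-shot cross-lingual transfer concrete, here is a minimal toy sketch in Python. It is not how any particular model works: the hand-written `SHARED_EMBEDDINGS` table is an invented stand-in for a pre-trained multilingual encoder, and the words, vectors, and helper names (`encode`, `fit_centroids`, `predict`) are all assumptions made for illustration. The only point it shows is that a classifier fitted on English labels can label German text when both languages share one representation space.

```python
# Toy stand-in for a multilingual encoder (hypothetical, hand-made vectors):
# translation pairs are placed close together in one shared 2-D space.
SHARED_EMBEDDINGS = {
    "good": (1.0, 0.1), "gut": (0.9, 0.2),
    "great": (0.9, 0.0), "toll": (1.0, 0.0),
    "bad": (-1.0, 0.1), "schlecht": (-0.9, 0.2),
    "awful": (-0.9, 0.0), "furchtbar": (-1.0, 0.0),
    "movie": (0.0, 1.0), "film": (0.1, 0.9),
}


def encode(sentence):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [SHARED_EMBEDDINGS[w] for w in sentence.lower().split()
               if w in SHARED_EMBEDDINGS]
    if not vectors:
        return (0.0, 0.0)
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def fit_centroids(examples):
    """Compute one mean vector (centroid) per label from (sentence, label) pairs."""
    sums = {}
    for sentence, label in examples:
        x, y = encode(sentence)
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    return {label: (sx / c, sy / c) for label, (sx, sy, c) in sums.items()}


def predict(centroids, sentence):
    """Return the label whose centroid is nearest to the encoded sentence."""
    x, y = encode(sentence)
    return min(
        centroids,
        key=lambda lab: (centroids[lab][0] - x) ** 2 + (centroids[lab][1] - y) ** 2,
    )


# "Fine-tuning" uses English labels only.
english_train = [
    ("good movie", "pos"), ("great movie", "pos"),
    ("bad movie", "neg"), ("awful movie", "neg"),
]
centroids = fit_centroids(english_train)

# Zero-shot evaluation: German sentences, no German labels were ever seen.
german_test = ["der film ist toll", "der film ist furchtbar", "gut", "schlecht"]
predictions = [predict(centroids, s) for s in german_test]
print(list(zip(german_test, predictions)))
```

In a real setting the lookup table would be replaced by a pre-trained multilingual transformer and the nearest-centroid rule by a fine-tuned task head; the transfer works only to the extent that the encoder actually aligns the two languages.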
Papers
December 18, 2021
November 3, 2021