Large Pre-Trained Language Models
Large pre-trained language models (LLMs) are AI systems trained on massive text corpora with the goal of human-level natural language understanding and generation. Current research focuses on improving efficiency (e.g., through parameter-efficient fine-tuning methods such as LoRA and BitFit, and alternative architectures such as ModuleFormer), improving robustness and reducing bias (e.g., via data augmentation and techniques that mitigate hallucinations), and adapting LLMs to low-resource languages (e.g., using translation and few-shot learning). These advances have significant implications for applications such as dialogue systems, text-to-code generation, and biomedical natural language processing, while also raising important questions about computational cost and ethics.
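To make the parameter-efficient fine-tuning idea mentioned above concrete, the sketch below shows a minimal LoRA-style adapter in PyTorch: the pre-trained weights are frozen and only a small low-rank update is trained. This is an illustrative example under assumed settings (the `LoRALinear` class, the rank, and the scaling factor are hypothetical choices, not taken from any of the papers listed here).

```python
# Minimal sketch of a LoRA-style adapter (assumes PyTorch is installed).
# A frozen pre-trained linear layer is augmented with a trainable low-rank
# update B @ A, so only rank * (d_in + d_out) parameters are fine-tuned.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pre-trained weights
        # Low-rank factors: A is small random init, B starts at zero so the
        # adapter initially leaves the base model's behavior unchanged.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus the scaled low-rank correction.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling


if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(768, 768), rank=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable parameters: {trainable} / {total}")
```

In this sketch only about 12k of the roughly 600k parameters are trainable, which illustrates why such methods reduce the memory and compute cost of adapting a large model.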
Papers
Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis
Josip Jukić, Jan Šnajder
Little Red Riding Hood Goes Around the Globe: Crosslingual Story Planning and Generation with Large Language Models
Evgeniia Razumovskaia, Joshua Maynez, Annie Louis, Mirella Lapata, Shashi Narayan