Large Pre-Trained Language Models
Large pre-trained language models (LLMs) are AI systems trained on massive text corpora to understand and generate natural language. Current research focuses on improving efficiency (e.g., through parameter-efficient fine-tuning methods such as LoRA and BitFit, and alternative architectures such as ModuleFormer), on reducing bias and improving robustness (e.g., via data augmentation and techniques that mitigate hallucination), and on adapting LLMs to low-resource languages (e.g., using translation and few-shot learning). These advances matter for applications such as dialogue systems, text-to-code generation, and biomedical natural language processing, while also raising important concerns about computational cost and ethical impact.
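To make the parameter-efficient fine-tuning idea concrete, below is a minimal sketch of a LoRA-style adapter in PyTorch. It is an illustration of the general low-rank-update technique, not an implementation from any of the papers listed here; the class name LoRALinear and the rank/alpha defaults are assumptions chosen for the example.

```python
# Minimal LoRA-style adapter sketch (hypothetical, for illustration only):
# the frozen base weight W is augmented with a trainable low-rank update B @ A,
# so only r * (d_in + d_out) parameters are trained instead of d_in * d_out.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pre-trained weights
            p.requires_grad = False
        d_in, d_out = base.in_features, base.out_features
        self.lora_a = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # down-projection
        self.lora_b = nn.Parameter(torch.zeros(d_out, rank))        # up-projection, zero-init
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path plus scaled low-rank update
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)


# Usage: wrap a projection layer from a pre-trained model, then train only the adapter.
layer = LoRALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 768))
print(out.shape)  # torch.Size([2, 768])
```

Because the low-rank matrices are the only trainable parameters, fine-tuning memory and storage costs drop sharply compared with updating the full weight matrix.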
Papers
Unraveling ChatGPT: A Critical Analysis of AI-Generated Goal-Oriented Dialogues and Annotations
Tiziano Labruna, Sofia Brenna, Andrea Zaninello, Bernardo Magnini
RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning
Alexander Scarlatos, Andrew Lan
HumBEL: A Human-in-the-Loop Approach for Evaluating Demographic Factors of Language Models in Human-Machine Conversations
Anthony Sicilia, Jennifer C. Gates, Malihe Alikhani
Regex-augmented Domain Transfer Topic Classification based on a Pre-trained Language Model: An application in Financial Domain
Vanessa Liao, Syed Shariyar Murtaza, Yifan Nie, Jimmy Lin