Large Pre-Trained Language Models
Large pre-trained language models (LLMs) are trained on massive text corpora with the goal of approaching human-level natural language understanding and generation. Current research focuses on improving efficiency (e.g., through parameter-efficient fine-tuning methods such as LoRA and BitFit, and alternative architectures such as ModuleFormer), addressing bias and robustness (e.g., via data augmentation and techniques that mitigate hallucination), and adapting LLMs to low-resource languages (e.g., using translation and few-shot learning). These advances have significant implications for applications such as dialogue systems, text-to-code generation, and biomedical natural language processing, while also raising important questions about computational cost and ethics.
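To make the parameter-efficiency idea concrete, the sketch below illustrates the general LoRA-style approach mentioned above: the pre-trained weights are frozen and only a small low-rank correction is trained. This is a minimal, hypothetical PyTorch example written for this overview (the class `LoRALinear` and the values of `r` and `alpha` are illustrative assumptions, not code from any of the papers listed below).

```python
# Minimal sketch of LoRA-style parameter-efficient fine-tuning (illustrative only).
# The frozen weight W is augmented with a trainable low-rank update B @ A, so only
# r * (d_in + d_out) parameters are trained instead of d_in * d_out.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path plus low-rank trainable correction
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

# Toy usage: wrap a single (randomly initialized stand-in for a) pre-trained projection.
pretrained = nn.Linear(768, 768)
adapted = LoRALinear(pretrained, r=8)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")  # 12,288 vs. 590,592 in the full layer
```

BitFit takes the idea further by updating only bias terms, and mixture-of-adaptations approaches such as AdaMix (listed below) combine several such lightweight modules.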
Papers
Detecting Label Errors by using Pre-Trained Language Models
Derek Chong, Jenny Hong, Christopher D. Manning
Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning
Jiahui Gao, Renjie Pi, Yong Lin, Hang Xu, Jiacheng Ye, Zhiyong Wu, Weizhong Zhang, Xiaodan Liang, Zhenguo Li, Lingpeng Kong
Are Large Pre-Trained Language Models Leaking Your Personal Information?
Jie Huang, Hanyin Shao, Kevin Chen-Chuan Chang
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning
Yaqing Wang, Sahaj Agarwal, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung, Lianhui Qin, Sean Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras, Yejin Choi
Different Affordances on Facebook and SMS Text Messaging Do Not Impede Generalization of Language-Based Predictive Models
Tingting Liu, Salvatore Giorgi, Xiangyu Tao, Sharath Chandra Guntuku, Douglas Bellew, Brenda Curtis, Lyle Ungar
Towards Coherent and Consistent Use of Entities in Narrative Generation
Pinelopi Papalampidi, Kris Cao, Tomas Kocisky