Pretrained Language Model
Pretrained language models (PLMs) are large neural networks trained on massive text corpora to learn general language representations, enabling efficient adaptation to various downstream tasks like text classification and question answering. Current research focuses on improving PLM efficiency (e.g., through parameter-efficient fine-tuning and model compression), addressing biases and enhancing robustness, and better understanding their internal knowledge representations and limitations (e.g., through probing techniques and analysis of attention mechanisms). These advancements are significantly impacting numerous fields, from improving information retrieval and machine translation to creating more sophisticated and responsible AI systems.
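To make the parameter-efficient fine-tuning mentioned above concrete, the sketch below freezes a pretrained encoder and trains only its small classification head on a downstream task. This is a generic illustration rather than the method of any paper listed here; the checkpoint name, label count, and toy data are placeholders.

```python
# Minimal sketch of parameter-efficient adaptation of a PLM:
# freeze the pretrained backbone, train only the classification head.
# Checkpoint, labels, and data below are illustrative assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-uncased"  # assumed checkpoint; any encoder-style PLM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Freeze all pretrained parameters; only the randomly initialized
# classification head remains trainable.
for param in model.base_model.parameters():
    param.requires_grad = False

trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-3)

# Toy training step on a single labeled example (placeholder data).
batch = tokenizer(["the movie was great"], return_tensors="pt")
labels = torch.tensor([1])

model.train()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Because only the head's parameters receive gradients, the memory and compute cost of adaptation is a small fraction of full fine-tuning, which is the general trade-off parameter-efficient methods exploit.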
Papers
Measuring the Knowledge Acquisition-Utilization Gap in Pretrained Language Models
Amirhossein Kazemnejad, Mehdi Rezagholizadeh, Prasanna Parthasarathi, Sarath Chandar
Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization
Shoujie Tong, Heming Xia, Damai Dai, Runxin Xu, Tianyu Liu, Binghuai Lin, Yunbo Cao, Zhifang Sui
Language-Agnostic Bias Detection in Language Models with Bias Probing
Abdullatif Köksal, Omer Faruk Yalcin, Ahmet Akbiyik, M. Tahir Kilavuz, Anna Korhonen, Hinrich Schütze
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Xiao Wang, Weikang Zhou, Qi Zhang, Jie Zhou, Songyang Gao, Junzhe Wang, Menghan Zhang, Xiang Gao, Yunwen Chen, Tao Gui
PrOnto: Language Model Evaluations for 859 Languages
Luke Gessler