Pre-trained Language Model
Pre-trained language models (PLMs) are large neural networks trained on massive text corpora to capture the statistical regularities of language, which can then be transferred to a wide range of downstream tasks. Current research focuses on improving PLM efficiency through techniques such as parameter-efficient fine-tuning and on exploring their application in diverse fields, including scientific text classification, mental health assessment, and financial forecasting, often leveraging architectures like BERT and its variants. The ability of PLMs to effectively process and generate human language has significant implications for numerous scientific disciplines and practical applications, ranging from improved information retrieval to more sophisticated AI assistants.
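To make the parameter-efficient fine-tuning mentioned above concrete, the following is a minimal sketch using LoRA adapters on a BERT-style checkpoint. It assumes the Hugging Face `transformers` and `peft` packages; the model name, label count, and LoRA hyperparameters are illustrative choices, not taken from any of the papers listed below.

```python
# Minimal sketch: parameter-efficient fine-tuning of a BERT-style PLM with LoRA.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

model_name = "bert-base-uncased"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# LoRA injects small trainable low-rank matrices into the attention projections
# while the original pre-trained weights stay frozen.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # rank of the low-rank update matrices
    lora_alpha=16,                      # scaling factor for the updates
    lora_dropout=0.1,
    target_modules=["query", "value"],  # names of BERT's attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all parameters

# The wrapped model trains with the usual Trainer or a plain PyTorch loop,
# updating only the LoRA parameters; a forward pass works as before.
inputs = tokenizer("PLMs capture statistical regularities of language.",
                   return_tensors="pt")
outputs = model(**inputs)
```

Freezing the backbone and training only the adapter weights is what makes such methods attractive in settings like federated or on-device tuning, where communicating or storing full model updates would be prohibitive.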
Papers
In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models
Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown
Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis
Josip Jukić, Jan Šnajder
Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study
Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li
Do language models have coherent mental models of everyday things?
Yuling Gu, Bhavana Dalvi Mishra, Peter Clark
When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods
Zhuo Zhang, Yuanhang Yang, Yong Dai, Lizhen Qu, Zenglin Xu