Pre-Trained Language Model
Pre-trained language models (PLMs) are large neural networks trained on massive text corpora to capture the statistical regularities of language, which can then be transferred to a wide range of downstream tasks. Current research focuses on improving PLM efficiency, for example through parameter-efficient fine-tuning, and on applying PLMs in diverse fields such as scientific text classification, mental health assessment, and financial forecasting, often building on architectures like BERT and its variants. The ability of PLMs to process and generate human language has significant implications for many scientific disciplines and practical applications, from improved information retrieval to more sophisticated AI assistants.
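As a concrete illustration of parameter-efficient fine-tuning of a BERT-style PLM, the sketch below applies LoRA adapters to a sequence-classification head. It assumes the Hugging Face transformers and peft libraries; the model name, label count, and LoRA hyperparameters are illustrative choices and do not come from any of the papers listed here.

```python
# Minimal sketch: parameter-efficient fine-tuning of a BERT-style PLM with LoRA.
# Assumes the Hugging Face `transformers` and `peft` libraries are installed.
# Model name, label count, and LoRA hyperparameters are illustrative only.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

model_name = "bert-base-uncased"  # any BERT variant works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# LoRA injects small trainable low-rank matrices into the attention projections
# while the original pre-trained weights stay frozen.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # rank of the low-rank update (assumed value)
    lora_alpha=16,                      # scaling factor (assumed value)
    lora_dropout=0.1,
    target_modules=["query", "value"],  # BERT self-attention projection layers
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()      # only a small fraction of parameters are trainable

# The wrapped model is used like any transformers model, e.g. with Trainer
# or a plain training loop over tokenized batches.
batch = tokenizer(["an example sentence"], return_tensors="pt")
outputs = model(**batch)
print(outputs.logits.shape)             # (1, num_labels)
```

Because only the injected adapter weights are updated, this kind of fine-tuning typically trains well under one percent of the model's parameters, which is what makes it attractive for the efficiency-focused work described above.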
Papers
Dipping PLMs Sauce: Bridging Structure and Text for Effective Knowledge Graph Completion via Conditional Soft Prompting
Chen Chen, Yufei Wang, Aixin Sun, Bing Li, Kwok-Yan Lam
Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases
Yingji Li, Mengnan Du, Xin Wang, Ying Wang
Investigating Pre-trained Language Models on Cross-Domain Datasets, a Step Closer to General AI
Mohamad Ballout, Ulf Krumnack, Gunther Heidemann, Kai-Uwe Kühnberger
Opening the Black Box: Analyzing Attention Weights and Hidden States in Pre-trained Language Models for Non-language Tasks
Mohamad Ballout, Ulf Krumnack, Gunther Heidemann, Kai-Uwe Kühnberger
Towards Accurate Translation via Semantically Appropriate Application of Lexical Constraints
Yujin Baek, Koanho Lee, Dayeon Ki, Hyoung-Gyu Lee, Cheonbok Park, Jaegul Choo
JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving
Wayne Xin Zhao, Kun Zhou, Beichen Zhang, Zheng Gong, Zhipeng Chen, Yuanhang Zhou, Ji-Rong Wen, Jing Sha, Shijin Wang, Cong Liu, Guoping Hu
Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference
Junhao Zheng, Qianli Ma, Shengjie Qiu, Yue Wu, Peitian Ma, Junlong Liu, Huawen Feng, Xichen Shang, Haibin Chen
QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search
Jian Xie, Yidan Liang, Jingping Liu, Yanghua Xiao, Baohua Wu, Shenghua Ni
Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method
Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shu Zhao, Peng Zhang, Jie Tang