Pre-trained Language Models
Pre-trained language models (PLMs) are large neural networks trained on massive text corpora to capture the statistical regularities of language, which can then be transferred to a wide range of downstream tasks. Current research focuses on improving PLM efficiency through techniques such as parameter-efficient fine-tuning, and on applying these models in diverse fields, including scientific text classification, mental health assessment, and financial forecasting, often building on architectures such as BERT and its variants. The ability of PLMs to process and generate human language effectively has significant implications for many scientific disciplines and practical applications, from improved information retrieval to more sophisticated AI assistants.
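To make the idea of parameter-efficient fine-tuning concrete, the following is a minimal sketch (not drawn from any of the papers below) of adapting a pre-trained BERT model to a binary text-classification task. It assumes the Hugging Face `transformers` and `peft` libraries and uses LoRA adapters as one representative technique: the original pre-trained weights stay frozen, and only small low-rank adapter matrices plus the classification head are trained.

```python
# Minimal sketch of parameter-efficient fine-tuning with LoRA adapters.
# Assumes the Hugging Face `transformers` and `peft` libraries; model name,
# label count, and example sentences are illustrative choices.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Attach low-rank adapters to the attention projections; only these adapters
# (and the classifier head) receive gradients, the backbone stays frozen.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # rank of the adapter matrices
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],  # BERT attention projection layers
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()      # typically well under 1% of all weights

# One illustrative forward/backward step on a toy batch.
batch = tokenizer(
    ["The market outlook is positive.", "Results were disappointing."],
    padding=True,
    return_tensors="pt",
)
labels = torch.tensor([1, 0])
loss = model(**batch, labels=labels).loss
loss.backward()
```

In practice the wrapped model would be passed to a standard training loop or the `transformers` Trainer; the point of the sketch is only that the trainable-parameter count drops by orders of magnitude relative to full fine-tuning.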
Papers
XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters
Xuanyu Zhang, Qing Yang, Dongliang Xu
Enhancing Few-shot NER with Prompt Ordering based Data Augmentation
Huiming Wang, Liying Cheng, Wenxuan Zhang, De Wen Soh, Lidong Bing
Prompting with Pseudo-Code Instructions
Mayank Mishra, Prince Kumar, Riyaz Bhat, Rudra Murthy, Danish Contractor, Srikanth Tamilselvam
Constructing Word-Context-Coupled Space Aligned with Associative Knowledge Relations for Interpretable Language Modeling
Fanyu Wang, Zhenping Xie
Recouple Event Field via Probabilistic Bias for Event Extraction
Xingyu Bai, Taiqiang Wu, Han Guo, Zhe Zhao, Xuefeng Yang, Jiayi Li, Weijie Liu, Qi Ju, Weigang Guo, Yujiu Yang
Zero-Shot Text Classification via Self-Supervised Tuning
Chaoqun Liu, Wenxuan Zhang, Guizhen Chen, Xiaobao Wu, Anh Tuan Luu, Chip Hong Chang, Lidong Bing
SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation
Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng
Ahead-of-Time P-Tuning
Daniil Gavrilov, Nikita Balagansky
Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings
Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Chong Deng, Hai Yu, Jiaqing Liu, Yukun Ma, Chong Zhang
UOR: Universal Backdoor Attacks on Pre-trained Language Models
Wei Du, Peixuan Li, Boqun Li, Haodong Zhao, Gongshen Liu
Enhancing Keyphrase Extraction from Long Scientific Documents using Graph Embeddings
Roberto Martínez-Cruz, Debanjan Mahata, Alvaro J. López-López, José Portela
Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models
Boxi Cao, Qiaoyu Tang, Hongyu Lin, Shanshan Jiang, Bin Dong, Xianpei Han, Jiawei Chen, Tianshu Wang, Le Sun
Pre-Training to Learn in Context
Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
Knowledge Rumination for Pre-trained Language Models
Yunzhi Yao, Peng Wang, Shengyu Mao, Chuanqi Tan, Fei Huang, Huajun Chen, Ningyu Zhang
Sensitivity and Robustness of Large Language Models to Prompt Template in Japanese Text Classification Tasks
Chengguang Gan, Tatsunori Mori
Recyclable Tuning for Continual Pre-training
Yujia Qin, Cheng Qian, Xu Han, Yankai Lin, Huadong Wang, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou