Pre-Trained Language Models
Pre-trained language models (PLMs) are large neural networks trained on massive text corpora, aiming to capture the statistical regularities of language for use in downstream tasks. Current research focuses on improving PLM efficiency through techniques such as parameter-efficient fine-tuning, and on applying PLMs in diverse fields, including scientific text classification, mental health assessment, and financial forecasting, often building on architectures such as BERT and its variants. The ability of PLMs to process and generate human language effectively has significant implications for numerous scientific disciplines and practical applications, ranging from improved information retrieval to more sophisticated AI assistants.
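As a concrete illustration of the parameter-efficient fine-tuning idea mentioned above, the sketch below freezes a pre-trained BERT encoder and trains only a small classification head on top of it. This is a minimal, illustrative example, not the method of any paper listed here: the model name, the binary task head, and the toy two-sentence batch are assumptions made for the sake of the example.

```python
# Minimal sketch of parameter-efficient fine-tuning: freeze a pre-trained
# BERT encoder and update only a small task-specific head.
# Model choice, label count, and the toy batch below are illustrative assumptions.
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

encoder = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Freeze all pre-trained parameters; only the new head receives gradients.
for param in encoder.parameters():
    param.requires_grad = False

num_labels = 2  # hypothetical binary classification task
head = nn.Linear(encoder.config.hidden_size, num_labels)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)

# Toy batch standing in for a real dataset.
batch = tokenizer(["an example sentence", "another example"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([0, 1])

outputs = encoder(**batch)                    # frozen forward pass
cls_repr = outputs.last_hidden_state[:, 0]    # [CLS] token representation
loss = nn.functional.cross_entropy(head(cls_repr), labels)
loss.backward()                               # gradients flow only into the head
optimizer.step()
```

Only the head's parameters (a few thousand weights) are updated, which is the core trade-off parameter-efficient methods exploit: most of the pre-trained network stays fixed while a small number of new parameters adapt it to the task.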
Papers
Empirical Sufficiency Lower Bounds for Language Modeling with Locally-Bootstrapped Semantic Structures
Jakob Prange, Emmanuele Chersoni
PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models
Zhuocheng Gong, Jiahao Liu, Qifan Wang, Yang Yang, Jingang Wang, Wei Wu, Yunsen Xian, Dongyan Zhao, Rui Yan
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang
AdapterEM: Pre-trained Language Model Adaptation for Generalized Entity Matching using Adapter-tuning
John Bosco Mugeni, Steven Lynden, Toshiyuki Amagasa, Akiyoshi Matono
Plug-and-Play Knowledge Injection for Pre-trained Language Models
Zhengyan Zhang, Zhiyuan Zeng, Yankai Lin, Huadong Wang, Deming Ye, Chaojun Xiao, Xu Han, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning
Guangtao Zeng, Peiyuan Zhang, Wei Lu
In-Context Analogical Reasoning with Pre-Trained Language Models
Xiaoyang Hu, Shane Storks, Richard L. Lewis, Joyce Chai
Fine-tuning Happens in Tiny Subspaces: Exploring Intrinsic Task-specific Subspaces of Pre-trained Language Models
Zhong Zhang, Bang Liu, Junming Shao
Modeling Adversarial Attack on Pre-trained Language Models as Sequential Decision Making
Xuanjie Fang, Sijie Cheng, Yang Liu, Wei Wang
Improving Generalization in Language Model-Based Text-to-SQL Semantic Parsing: Two Simple Semantic Boundary-Based Techniques
Daking Rai, Bailin Wang, Yilun Zhou, Ziyu Yao
Large language models improve Alzheimer's disease diagnosis using multi-modality data
Yingjie Feng, Jun Wang, Xianfeng Gu, Xiaoyin Xu, Min Zhang
PromptNER: Prompt Locating and Typing for Named Entity Recognition
Yongliang Shen, Zeqi Tan, Shuhui Wu, Wenqi Zhang, Rongsheng Zhang, Yadong Xi, Weiming Lu, Yueting Zhuang
Learning and Leveraging Verifiers to Improve Planning Capabilities of Pre-trained Language Models
Daman Arora, Subbarao Kambhampati
Learning to Imagine: Visually-Augmented Natural Language Generation
Tianyi Tang, Yushuo Chen, Yifan Du, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen
A Study on Knowledge Distillation from Weak Teacher for Scaling Up Pre-trained Language Models
Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Sung Ju Hwang, Alexander Min
Parameter-Efficient Fine-Tuning without Introducing New Latency
Baohao Liao, Yan Meng, Christof Monz
Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models
Neal Lawton, Anoop Kumar, Govind Thattai, Aram Galstyan, Greg Ver Steeg
Label Agnostic Pre-training for Zero-shot Text Classification
Christopher Clarke, Yuzhao Heng, Yiping Kang, Krisztian Flautner, Lingjia Tang, Jason Mars
Passive learning of active causal strategies in agents and language models
Andrew Kyle Lampinen, Stephanie C Y Chan, Ishita Dasgupta, Andrew J Nam, Jane X Wang