Pre-Trained Language Models
Pre-trained language models (PLMs) are large neural networks trained on massive text corpora to capture the statistical regularities of language, which they then transfer to a wide range of downstream tasks. Current research focuses on improving PLM efficiency, for example through parameter-efficient fine-tuning, and on applying them in diverse fields such as scientific text classification, mental health assessment, and financial forecasting, often building on architectures like BERT and its variants. The ability of PLMs to process and generate human language effectively has significant implications across scientific disciplines and practical applications, from improved information retrieval to more capable AI assistants.
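As a minimal, illustrative sketch of the parameter-efficient fine-tuning idea mentioned above (not drawn from any of the listed papers): a pre-trained BERT encoder is frozen and only a small, newly added classification head is trained. The checkpoint name, label count, and helper function below are assumptions chosen purely for illustration.

```python
# Sketch: freeze a pre-trained BERT encoder and train only a lightweight head.
# Checkpoint name, num_labels, and the classify() helper are illustrative assumptions.
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumed checkpoint; any BERT variant would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

# Freeze all pre-trained parameters so only the new head receives gradient updates.
for param in encoder.parameters():
    param.requires_grad = False

num_labels = 2  # hypothetical downstream task, e.g. binary text classification
classifier = nn.Linear(encoder.config.hidden_size, num_labels)

def classify(texts):
    """Encode a batch of texts with the frozen encoder and score with the small head."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state[:, 0]  # [CLS] token representation
    return classifier(hidden)

logits = classify(["An example sentence for the frozen encoder."])
print(logits.shape)  # torch.Size([1, 2])
```

Because only the linear head's parameters are trainable, the memory and compute cost of adaptation is a tiny fraction of full fine-tuning; more elaborate schemes (adapters, LoRA, prompt tuning) follow the same principle of updating a small subset of parameters.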
Papers
DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains
Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille, Pierre-Antoine Gourraud
MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model
Xin Yao, Ziqing Yang, Yiming Cui, Shijin Wang
Typhoon: Towards an Effective Task-Specific Masking Strategy for Pre-trained Language Models
Muhammed Shahir Abdurrahman, Hashem Elezabi, Bruce Changlong Xu
TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models
Md Kamrul Hasan, Md Saiful Islam, Sangwu Lee, Wasifur Rahman, Iftekhar Naim, Mohammed Ibrahim Khan, Ehsan Hoque