Masked Language
Masked language modeling (MLM) is a self-supervised technique for training language models: some of the tokens in an input sentence are hidden, and the model learns to predict them from the surrounding context. Current research aims to make MLM more efficient and effective through novel masking strategies, enhanced model architectures (such as adding decoders to encoder-only models), and more robust evaluation metrics for assessing bias and performance across diverse tasks and languages. These advances matter because they aim to produce more accurate and less biased language models, with applications across natural language processing, including machine translation, text generation, and question answering.
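The masking step described above can be illustrated with a minimal sketch. It covers only the data preparation (hiding tokens and recording what the model should predict), not the model or its training loop. The function name `mask_tokens`, the `[MASK]` placeholder string, and the default masking rate of 0.15 are assumptions chosen for illustration; the summary above does not specify them.

```python
import random

MASK = "[MASK]"  # placeholder string; an assumption for this sketch


def mask_tokens(tokens, mask_rate=0.15, seed=None):
    """Hide a random subset of tokens and return (masked_tokens, labels).

    labels[i] holds the original token at each masked position and None
    elsewhere, so a model would be trained to predict only the hidden tokens.
    """
    if not tokens:
        return [], []
    rng = random.Random(seed)
    # Mask at least one token, but never more than the sequence contains.
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None for i, tok in enumerate(tokens)]
    return masked, labels


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    masked, labels = mask_tokens(sentence, mask_rate=0.5, seed=0)
    print(masked)
    print(labels)
```

The masking rate is exposed as a parameter because it is one of the knobs that work on masking strategies varies, for example by changing it over the course of pretraining.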
Papers
Deriving Language Models from Masked Language Models
Lucas Torroba Hennigen, Yoon Kim
Self-Evolution Learning for Discriminative Language Model Pretraining
Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao
Dynamic Masking Rate Schedules for MLM Pretraining
Zachary Ankner, Naomi Saphra, Davis Blalock, Jonathan Frankle, Matthew L. Leavitt
How does the task complexity of masked pretraining objectives affect downstream performance?
Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa
ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
Xingchen Song, Di Wu, Binbin Zhang, Zhendong Peng, Bo Dang, Fuping Pan, Zhiyong Wu