Whole Word Masking

Whole word masking (WWM) is a natural language processing technique that modifies standard masked language modeling: whenever any sub-word (or, in character-tokenized languages such as Chinese, any character) belonging to a word is selected for masking, every piece of that word is masked together, so the model must predict the whole word rather than an isolated fragment. Current research focuses on optimizing WWM for various languages and tasks, exploring its effectiveness across model architectures such as BERT and its variants, and investigating its interplay with other techniques such as grammar-constrained decoding. This work aims to improve the performance and robustness of language models, particularly on complex linguistic structures and diverse dialects, with downstream benefits for applications such as machine translation, question answering, and grammatical error correction.
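
The sketch below is a minimal, illustrative take on the idea (not the implementation used by any particular paper): WordPiece continuation pieces marked with the "##" prefix are grouped with the piece that starts the word, and masking decisions are made per word group rather than per token, so a chosen word is masked in full.

```python
# Minimal whole word masking sketch over WordPiece-style tokens.
# Assumes the "##" continuation convention; mask_prob and the example
# tokens are illustrative values, not taken from any specific paper.
import random

MASK_TOKEN = "[MASK]"

def group_whole_words(tokens):
    """Group token indices into whole-word spans."""
    groups = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)   # continuation piece joins the previous word
        else:
            groups.append([i])     # start of a new word
    return groups

def whole_word_mask(tokens, mask_prob=0.15, seed=0):
    """Mask roughly mask_prob of the words; every sub-word of a chosen word is masked."""
    rng = random.Random(seed)
    masked = list(tokens)
    for group in group_whole_words(tokens):
        if rng.random() < mask_prob:
            for i in group:
                masked[i] = MASK_TOKEN
    return masked

# "unbelievable" splits into three pieces; under WWM they are masked
# together rather than independently.
tokens = ["the", "story", "was", "un", "##believ", "##able"]
print(whole_word_mask(tokens, mask_prob=0.5, seed=1))
```

In practice this logic usually lives in the pretraining data pipeline or collator (for example, Hugging Face Transformers ships a whole-word-mask data collator), and standard MLM details such as replacing a fraction of masked positions with random tokens are layered on top.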

Papers