Sign to Gloss
Sign-to-gloss research focuses on automatically transcribing sign language videos into glosses, the written labels that stand for individual signs. This is typically an intermediate step in a full sign language translation system, which then maps the glosses to spoken-language text. Current work aims to improve the accuracy and efficiency of this step using techniques such as cross-modality data augmentation, pretrained language models (e.g., BERT), and discrete diffusion models. These approaches target two main obstacles: the scarcity of annotated training data and the modality gap between video and text. Better sign-to-gloss performance in turn supports sign language processing systems that broaden access to communication for the Deaf community.
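
To make the task concrete, the following is a minimal sketch of the input and output of a sign-to-gloss step. It assumes a hypothetical upstream video model has already assigned each frame either a gloss label or a blank marker. The collapsing rule used here (merge consecutive repeats, then drop blanks) is borrowed from CTC-style decoding, which is an assumption for illustration and not a method named above. The gloss labels, the `BLANK` marker, and the function name `frames_to_glosses` are all illustrative.

```python
from itertools import groupby
from typing import List

# Hypothetical marker for frames in which no sign is being produced.
BLANK = "<blank>"


def frames_to_glosses(frame_labels: List[str], blank: str = BLANK) -> List[str]:
    """Collapse per-frame labels into a gloss sequence.

    Consecutive identical labels are merged into one, and blank
    labels are then removed.
    """
    # groupby yields one key per run of identical consecutive labels.
    merged = [label for label, _ in groupby(frame_labels)]
    return [label for label in merged if label != blank]


# Illustrative per-frame predictions (not real model output).
frame_labels = [
    "<blank>", "IX-1", "IX-1", "<blank>",
    "STORE", "STORE", "STORE", "GO", "<blank>",
]

glosses = frames_to_glosses(frame_labels)
print(" ".join(glosses))  # prints "IX-1 STORE GO"
```

The sketch also shows the modality gap in miniature: nine frame-level predictions reduce to three glosses, so the video side is much longer and more redundant than the text side it has to be aligned with.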