Sign Language Translation
Sign language translation (SLT) aims to automatically convert sign language videos into spoken-language text, bridging the communication gap between deaf and hearing individuals. Current research relies heavily on transformer-based neural networks, often with multi-stream architectures that process hand gestures, facial expressions, and body movements in parallel, and explores techniques such as contrastive learning to sharpen feature discrimination. The field matters both for its potential to improve accessibility for deaf and hard-of-hearing communities and for the advances it drives in multimodal machine learning, particularly in modeling continuous, dynamic video streams and in mitigating data scarcity through data augmentation and self-supervised learning.
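To make the recipe above concrete, here is a minimal PyTorch sketch of the two ingredients the summary mentions: a multi-stream encoder (separate Transformer encoders for hand, face, and body features, fused per frame) and a symmetric InfoNCE contrastive loss between pooled video features and text embeddings. The names (`MultiStreamEncoder`, `info_nce`), the feature dimensions, and the concatenate-and-project fusion are illustrative assumptions, not the architecture of any paper listed below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiStreamEncoder(nn.Module):
    """Encode hand, face, and body feature streams with separate
    Transformer encoders, then fuse them frame by frame."""

    def __init__(self, stream_dims=(256, 128, 128), d_model=256,
                 nhead=4, num_layers=2):
        super().__init__()
        # One input projection and one Transformer encoder per stream.
        self.proj = nn.ModuleList([nn.Linear(d, d_model) for d in stream_dims])
        self.encoders = nn.ModuleList([
            nn.TransformerEncoder(
                nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
                num_layers,
            )
            for _ in stream_dims
        ])
        # Fuse by concatenating per-stream features and projecting back.
        self.fuse = nn.Linear(d_model * len(stream_dims), d_model)

    def forward(self, streams):
        # streams: list of (batch, time, dim_i) tensors, one per cue.
        encoded = [enc(proj(x))
                   for enc, proj, x in zip(self.encoders, self.proj, streams)]
        return self.fuse(torch.cat(encoded, dim=-1))  # (batch, time, d_model)

def info_nce(video_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched video/text pairs in the batch are
    positives; every other pairing serves as a negative."""
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = v @ t.T / temperature                      # (batch, batch)
    targets = torch.arange(v.size(0), device=v.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))

# Toy usage with random tensors standing in for extracted features.
encoder = MultiStreamEncoder()
hands, face, body = (torch.randn(8, 64, d) for d in (256, 128, 128))
frame_feats = encoder([hands, face, body])   # (8, 64, 256)
video_emb = frame_feats.mean(dim=1)          # pool over time
text_emb = torch.randn(8, 256)               # stand-in for a text encoder
loss = info_nce(video_emb, text_emb)
```

In-batch negatives make this objective cheap to compute and encourage video features for different signed sentences to separate in embedding space, which is the "feature discrimination" benefit contrastive learning brings to SLT.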
Papers
Signs as Tokens: An Autoregressive Multilingual Sign Language Generator
Ronglai Zuo, Rolandos Alexandros Potamias, Evangelos Ververas, Jiankang Deng, Stefanos Zafeiriou
DiffSLT: Enhancing Diversity in Sign Language Translation via Diffusion Model
JiHwan Moon, Jihoon Park, Jungeun Kim, Jongseong Bae, Hyeongwoo Jeon, Ha Young Kim
Leveraging the Power of MLLMs for Gloss-Free Sign Language Translation
Jungeun Kim, Hyeongwoo Jeon, Jongseong Bae, Ha Young Kim
SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction
Shester Gueuwou, Xiaodan Du, Greg Shakhnarovich, Karen Livescu, Alexander H. Liu