Sign Language Video

Sign language video research aims to bridge the communication gap between deaf and hearing communities by developing computational methods that understand and translate sign language from video. Current work emphasizes multimodal approaches that combine visual information (RGB video, pose estimation) with linguistic information (glosses, text), using transformer networks, diffusion models, and other deep learning architectures to improve tasks such as sign recognition, translation, and retrieval. These advances support accessible technologies such as real-time sign language translation systems, video search engines, and tools that anonymize sign language videos while preserving their linguistic content, with the goal of improving accessibility for deaf individuals.

Papers