Lip Reading

Lip reading, the task of recognizing speech from lip movements alone, aims to improve communication accessibility and human-computer interaction. Current research focuses on building robust models that generalize across speakers, using techniques such as speaker adaptation, optical flow guidance for smoother video generation, and multi-scale or multi-encoder architectures that better capture the nuances of visual speech. These approaches draw on both visual and geometric features, sometimes incorporate audio information to improve accuracy, and are evaluated with metrics such as word error rate. Progress in the field has significant implications for assisting people with hearing impairments and for applications that require silent speech recognition.
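
For readers unfamiliar with word error rate, the sketch below shows one common way to compute it: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a model's hypothesis, divided by the number of reference words. The function name `word_error_rate`, the whitespace tokenization, and the example sentences are assumptions made for illustration; individual papers may normalize text differently before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch: tokenizes on whitespace and applies no text
    normalization (casing, punctuation), which real evaluations often add.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, not taken from any dataset or paper.
    print(word_error_rate("bin blue at f two now", "bin blue f two now"))
```

In this hypothetical example the hypothesis drops one of the six reference words, so the score is 1/6. Because insertions count as errors, the value can exceed 1 when the hypothesis is much longer than the reference.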

Papers