Paper ID: 2304.04068

Word-level Persian Lipreading Dataset

Javad Peymanfard, Ali Lashini, Samin Heydarian, Hossein Zeinali, Nasser Mozayani

Lipreading has made impressive progress in recent years, driven by advances in deep learning. Nonetheless, the prerequisite for such advances is a suitable dataset. This paper provides a new in-the-wild dataset for Persian word-level lipreading containing 244,000 videos from approximately 1,800 speakers. We evaluated the state-of-the-art method in this field and also used a novel approach for word-level lipreading. In this approach, we used the AV-HuBERT model for feature extraction and obtained significantly better performance on our dataset.

Submitted: Apr 8, 2023

Topics

Deep Learning
Wild Datasets
