Child Adult Speaker
Child-adult speaker identification and profiling are active research areas concerned with distinguishing child from adult voices in audio and video recordings of dyadic interactions. Current work builds on deep learning models, including speech foundation models such as wav2vec 2.0, and often adds multi-task learning and multimodal (audio-visual) approaches to improve accuracy and robustness, particularly in challenging conditions such as noisy home environments. These advances matter for applications that require automated analysis of child-adult interactions, such as educational technology, clinical diagnostics, and more personalized conversational agents.
Papers
June 12, 2024
April 18, 2024
October 3, 2023
May 21, 2023