Non Speech Vocalization

Non-speech vocalizations (NSVs), encompassing sounds like laughter, sighs, and cries, are increasingly studied for their communicative and emotional content. Current research focuses on developing robust methods for NSV recognition and classification using machine learning techniques, including deep learning architectures like attention U-Nets and classifier chains, often leveraging large, newly-created datasets. This work is significant for advancing our understanding of human communication, with applications ranging from improved speech processing in audio engineering to aiding in the diagnosis and monitoring of conditions like autism spectrum disorder.

Papers