Phonetic Posteriorgrams
Phonetic posteriorgrams (PPGs) are frame-level probabilistic representations of speech: for each time frame, a PPG gives the posterior probability of each phonetic class, typically as estimated by a speaker-independent acoustic model. Because they capture what is being said largely independently of who is saying it, PPGs are widely used to encode pronunciation in tasks such as voice conversion and speech editing. Current research focuses on improving PPG quality for higher-fidelity speech synthesis, on efficient architectures for real-time use (e.g., low-latency streaming voice conversion), and on alternatives to traditional PPGs, such as self-supervised speech representations. Applications include voice assistants, accessibility technologies for people with speech impairments, and media production tools.
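To make the idea of a PPG concrete, the following is a minimal sketch, not an implementation from any particular paper. It assumes a hypothetical acoustic model has already produced one vector of phonetic-class scores (logits) per frame; random numbers stand in for those scores here, and the function name `logits_to_ppg`, the frame count, and the class count are illustrative choices only.

```python
import numpy as np


def logits_to_ppg(logits: np.ndarray) -> np.ndarray:
    """Turn frame-level class logits of shape (T, C) into a PPG of shape (T, C).

    Each row of the result is a probability distribution over the C
    phonetic classes for one time frame (a row-wise softmax).
    """
    # Subtract the per-frame maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Stand-in for the output of a (hypothetical) speaker-independent acoustic model:
# 200 frames, 40 phonetic classes. Real systems choose their own class inventory.
rng = np.random.default_rng(0)
num_frames, num_classes = 200, 40
logits = rng.normal(size=(num_frames, num_classes))

ppg = logits_to_ppg(logits)

# The most likely phonetic class in each frame.
top_class_per_frame = ppg.argmax(axis=1)
```

Each row of `ppg` sums to 1, so the matrix can be read as "which sound is probably being produced at each moment". A downstream synthesis or voice conversion model would consume this matrix in place of the raw waveform or text.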