Audiobook Speech Synthesis
Audiobook speech synthesis aims to automatically generate high-quality, expressive audiobooks from text, reducing the substantial time and cost of traditional human narration. Current research focuses on improving the naturalness and expressiveness of synthesized speech, particularly by using neural architectures such as variational autoencoders (VAEs) and hierarchical transformers to model stylistic variation within and across sentences and paragraphs. This work broadens access to literature and drives progress in text-to-speech technology, with recent efforts producing large-scale open-source audiobook collections and interactive audiobook-creation tools.
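To make the hierarchical style-modeling idea concrete, below is a minimal sketch (not taken from any specific paper cited here): a sentence-level VAE encodes a style latent per sentence, and a paragraph-level transformer lets each sentence's latent attend to its neighbours. All module names, dimensions, and the toy input are illustrative assumptions, not an implementation of any particular system.

```python
# Illustrative sketch only: sentence-level style VAE + paragraph-level
# transformer over sentence style latents. Shapes and hyperparameters are
# arbitrary assumptions chosen for readability.

import torch
import torch.nn as nn


class SentenceStyleVAE(nn.Module):
    """Encodes one sentence's reference features into a style latent z."""

    def __init__(self, feat_dim: int = 80, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, 128, batch_first=True)
        self.to_mu = nn.Linear(128, latent_dim)
        self.to_logvar = nn.Linear(128, latent_dim)

    def forward(self, feats: torch.Tensor):
        # feats: (batch, frames, feat_dim) per-sentence reference features
        _, h = self.encoder(feats)              # h: (1, batch, 128)
        h = h.squeeze(0)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar


class ParagraphStyleContextualizer(nn.Module):
    """Contextualizes sentence style latents across a whole paragraph."""

    def __init__(self, latent_dim: int = 16, n_layers: int = 2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=latent_dim, nhead=4, dim_feedforward=64, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, sentence_latents: torch.Tensor):
        # sentence_latents: (batch, n_sentences, latent_dim)
        return self.transformer(sentence_latents)


if __name__ == "__main__":
    vae = SentenceStyleVAE()
    ctx = ParagraphStyleContextualizer()

    # Toy paragraph of 5 sentences, each with 200 frames of 80-dim features.
    sentences = [torch.randn(1, 200, 80) for _ in range(5)]
    latents = torch.stack([vae(s)[0] for s in sentences], dim=1)  # (1, 5, 16)
    paragraph_aware = ctx(latents)                                # (1, 5, 16)
    print(paragraph_aware.shape)  # each latent now reflects paragraph context
```

In a full system, the paragraph-aware latents would condition the acoustic model for each sentence, so prosody can vary smoothly across sentence boundaries rather than being predicted sentence by sentence in isolation.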