Text-Based Speech Editing
Text-based speech editing modifies an audio recording by editing its text transcript, offering a more intuitive and efficient alternative to manipulating the waveform by hand. Current research focuses on making edited speech sound natural and fluent, often using neural architectures such as transformers and diffusion models together with techniques such as context-aware prosody correction and semantic enrichment to improve intelligibility and consistency. The field matters for audio and video production, where it promises faster and more precise editing, and for accessibility technologies that support people with speech impediments.
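To make the transcript-driven idea concrete, here is a minimal sketch of only the first step: mapping a transcript edit onto the time spans of the recording that would need to be cut or resynthesized. It assumes word-level timestamps are already available (for example from a forced aligner) and uses Python's standard `difflib` to diff the word sequences. The function name `edit_regions` and the `(word, start_sec, end_sec)` tuple layout are illustrative assumptions, not taken from any particular paper or toolkit, and the neural synthesis step that would fill the edited spans is deliberately left out.

```python
from difflib import SequenceMatcher


def edit_regions(aligned_words, new_words):
    """Map a transcript edit onto time spans of the original audio.

    aligned_words: list of (word, start_sec, end_sec) for the original
        recording (hypothetical layout, e.g. produced by a forced aligner).
    new_words: list of words in the edited transcript.

    Returns a list of (operation, start_sec, end_sec, replacement_words),
    where operation is "replace", "delete", or "insert".
    """
    old_words = [word for word, _, _ in aligned_words]
    matcher = SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    regions = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged words keep their original audio
        if i1 < i2:
            # The edit covers original words i1..i2-1: take their time span.
            start = aligned_words[i1][1]
            end = aligned_words[i2 - 1][2]
        else:
            # Pure insertion: a zero-length span at the word boundary.
            if i1 < len(aligned_words):
                boundary = aligned_words[i1][1]
            elif aligned_words:
                boundary = aligned_words[-1][2]
            else:
                boundary = 0.0
            start = end = boundary
        regions.append((tag, start, end, new_words[j1:j2]))
    return regions


if __name__ == "__main__":
    original = [("the", 0.0, 0.2), ("quick", 0.2, 0.6), ("fox", 0.6, 1.0)]
    print(edit_regions(original, ["the", "slow", "fox"]))
```

In a full system, each returned span would be handed to a synthesis model that generates the replacement words and matches the prosody of the surrounding audio; that part is where the transformer and diffusion approaches mentioned above come in.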