Speech Tokenization
Speech tokenization represents continuous speech signals as discrete units, enabling powerful language-modeling techniques to be applied to audio data. Current research focuses on developing effective tokenization methods, often built on vector quantization (VQ) and transformer architectures, and evaluates them on downstream tasks such as speech recognition and synthesis using shared benchmarks. Better tokenizers are crucial for advancing speech language models, yielding more robust and efficient systems for applications ranging from speech-to-text to audio generation.
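To make the vector-quantization idea concrete, here is a minimal sketch of how a VQ tokenizer maps continuous feature frames to discrete token ids: each frame is assigned the index of its nearest codebook vector, and the discrete ids can be "dequantized" back into coarse continuous features. The codebook here is random and purely illustrative; a real speech tokenizer learns its codebook (and the feature extractor producing the frames) from data.

```python
import numpy as np

def quantize(frames, codebook):
    """Map each continuous frame to the id of its nearest codebook vector.

    frames:   (T, D) array of continuous speech features
    codebook: (K, D) array of learned codebook vectors
    returns:  (T,) array of discrete token ids in [0, K)
    """
    # Pairwise L2 distances between every frame and every codebook entry.
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def dequantize(tokens, codebook):
    """Reconstruct coarse continuous features from discrete token ids."""
    return codebook[tokens]

# Toy demo: a hypothetical 4-entry codebook over 2-D features.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(4, 2))
# Frames are codebook vectors plus small noise, so quantization
# should recover the original indices.
frames = codebook[[0, 2, 2, 1]] + 0.01 * rng.normal(size=(4, 2))
tokens = quantize(frames, codebook)
print(tokens.tolist())
```

Once speech is reduced to such token ids, standard language-model machinery (next-token prediction over a finite vocabulary) applies directly, which is the core motivation described above.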