Acoustic Token
Acoustic tokenization is the process of converting continuous audio signals into discrete units that machine learning models can process, with the main goal of improving audio language models (ALMs). Current research focuses on tokenization methods that better preserve semantic information, often using transformer-based architectures and techniques such as residual vector quantization and mel-filterbank discretization. More accurate and efficient tokenization supports a range of audio applications, including speech recognition, speech synthesis, music generation, and voice conversion.
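To make residual vector quantization concrete, here is a minimal sketch in plain Python. It is an illustration only, not the method of any particular paper: the function names (`nearest`, `rvq_encode`, `rvq_decode`) are hypothetical, and the random codebooks stand in for codebooks that a real system would learn from data. Each stage picks the closest code vector to what the previous stages left unexplained, so one audio frame becomes a short list of integer tokens.

```python
import random

def nearest(vec, codebook):
    """Return the index of the code vector closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((v - c) ** 2 for v, c in zip(vec, codebook[k])))

def rvq_encode(frame, codebooks):
    """Quantize one frame stage by stage; each stage encodes the previous residual."""
    residual = list(frame)
    tokens = []
    for codebook in codebooks:
        k = nearest(residual, codebook)
        tokens.append(k)
        # Subtract the chosen code vector; the next stage quantizes what is left.
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return tokens

def rvq_decode(tokens, codebooks):
    """Reconstruct a frame by summing the chosen code vector from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for k, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[k])]
    return out

# Toy setup (assumption): random codebooks with shrinking scale per stage,
# standing in for learned codebooks.
rng = random.Random(0)
DIM, STAGES, CODES = 4, 3, 8
codebooks = [[[rng.gauss(0, 0.5 ** s) for _ in range(DIM)] for _ in range(CODES)]
             for s in range(STAGES)]
frame = [rng.gauss(0, 1) for _ in range(DIM)]

tokens = rvq_encode(frame, codebooks)   # one integer token per stage
recon = rvq_decode(tokens, codebooks)   # approximate reconstruction of the frame
print(tokens, recon)
```

In this sketch a frame of `DIM` floats is reduced to `STAGES` integers, each in the range `0` to `CODES - 1`. Adding stages gives the decoder more terms to sum, which is how such schemes trade token count against reconstruction detail.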