Speech Language Model
Speech language models (SLMs) aim to process and generate speech directly, bypassing the traditional text intermediary of automatic speech recognition (ASR) and text-to-speech (TTS). Current research focuses on improving SLM architectures, such as hierarchical transformers and encoder-decoder models, often incorporating techniques like self-supervised learning, knowledge distillation, and prompt engineering to improve efficiency and performance on tasks including speech translation, synthesis, and question answering. These advances hold significant potential for more natural and intuitive human-computer interaction, particularly in applications requiring real-time speech processing and generation.
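To make the "direct speech" pipeline concrete, the sketch below illustrates the common textless-SLM recipe: continuous speech features are quantized into discrete acoustic units, and a language model is trained over those units instead of text. Everything here is a toy stand-in and an assumption, not any specific paper's method: a nearest-centroid quantizer plays the role of a self-supervised encoder plus k-means, and an add-one-smoothed bigram model plays the role of the transformer LM.

```python
# Toy sketch of a textless SLM pipeline (hypothetical, pure stdlib):
# speech frames -> discrete acoustic units -> unit language model -> sampled units.
import random

def quantize(frames, centroids):
    """Map each feature frame to the index of its nearest centroid
    (stand-in for a self-supervised encoder + k-means unit discovery)."""
    units = []
    for f in frames:
        dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in centroids]
        units.append(dists.index(min(dists)))
    return units

def train_bigram(units, vocab_size):
    """Count unit-to-unit transitions with add-one smoothing
    (stand-in for training a transformer LM over unit sequences)."""
    counts = [[1] * vocab_size for _ in range(vocab_size)]
    for prev, nxt in zip(units, units[1:]):
        counts[prev][nxt] += 1
    return [[c / sum(row) for c in row] for row in counts]

def generate(probs, start, length, rng):
    """Autoregressively sample a unit sequence from the bigram model;
    a vocoder would then map units back to a waveform."""
    seq = [start]
    for _ in range(length - 1):
        seq.append(rng.choices(range(len(probs)), weights=probs[seq[-1]])[0])
    return seq

rng = random.Random(0)
# Synthetic 2-D "speech features" clustered around two modes.
frames = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(50)] + \
         [(rng.gauss(3, 0.1), rng.gauss(3, 0.1)) for _ in range(50)]
centroids = [(0.0, 0.0), (3.0, 3.0)]
units = quantize(frames, centroids)
probs = train_bigram(units, vocab_size=2)
sample = generate(probs, start=units[0], length=10, rng=rng)
print(units[:10], sample)
```

In real systems the quantizer is a pretrained encoder such as HuBERT with k-means over its representations, the unit LM is a transformer, and a neural vocoder resynthesizes audio from the generated units; the sketch only preserves the shape of that pipeline.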