Speech Language Model
Speech language models (SLMs) aim to process and generate speech directly, bypassing the traditional text-based intermediaries of automatic speech recognition (ASR) and text-to-speech (TTS). Current research focuses on improving SLM architectures, such as hierarchical transformers and encoder-decoder models, often incorporating techniques like self-supervised learning, knowledge distillation, and prompt engineering to improve efficiency and performance on tasks including speech translation, synthesis, and question answering. These advances hold significant potential for more natural and intuitive human-computer interaction, particularly in applications requiring real-time speech processing and generation.
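At a high level, many SLMs follow a two-stage pipeline: continuous audio features are first discretized into a sequence of "speech units" (e.g., via a learned codebook), and a language model is then trained over those units with next-token prediction, just as a text LM predicts words. The toy sketch below illustrates this pipeline only; the function and class names are hypothetical, the codebook is hand-picked rather than learned, and a bigram counter stands in for the transformer used in practice.

```python
import math

def tokenize_frames(frames, codebook):
    """Map each continuous feature frame to the id of its nearest codebook entry.

    This stands in for a learned speech tokenizer; real systems use vector
    quantization over multi-dimensional features, not scalar matching.
    """
    return [min(range(len(codebook)), key=lambda i: abs(codebook[i] - f))
            for f in frames]

class BigramUnitLM:
    """Counts bigram statistics over discrete speech units and predicts the next unit.

    A stand-in for the transformer language model an actual SLM would use.
    """
    def __init__(self, vocab_size):
        self.vocab_size = vocab_size
        # Add-one smoothing so unseen transitions keep nonzero counts.
        self.counts = [[1] * vocab_size for _ in range(vocab_size)]

    def train(self, tokens):
        for prev, nxt in zip(tokens, tokens[1:]):
            self.counts[prev][nxt] += 1

    def next_token(self, prev):
        row = self.counts[prev]
        return max(range(self.vocab_size), key=lambda t: row[t])

# Toy usage: quantize a sine-like feature track, fit the unit LM,
# then greedily generate a short continuation of the unit sequence.
codebook = [-0.5, 0.0, 0.5]
frames = [math.sin(t / 3) * 0.5 for t in range(60)]
tokens = tokenize_frames(frames, codebook)
lm = BigramUnitLM(vocab_size=len(codebook))
lm.train(tokens)
generated = [tokens[-1]]
for _ in range(5):
    generated.append(lm.next_token(generated[-1]))
```

In a full SLM, generated units would be passed to a vocoder to synthesize a waveform; here the pipeline stops at the discrete-unit level to keep the sketch self-contained.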