Speech Language Model
Speech language models (SLMs) aim to process and generate speech directly, bypassing the traditional text-based intermediate steps of automatic speech recognition (ASR) and text-to-speech (TTS). Current research focuses on improving SLM architectures, such as hierarchical transformers and encoder-decoder models, often incorporating techniques like self-supervised learning, knowledge distillation, and prompt engineering to enhance efficiency and performance on tasks including speech translation, synthesis, and question answering. These advances hold significant potential for more natural and intuitive human-computer interaction, particularly in applications requiring real-time speech processing and generation.
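The core idea of bypassing ASR and TTS is to quantize speech into discrete "speech tokens" and then model those tokens directly with a language model. The sketch below is a deliberately minimal illustration of that pipeline, assuming random features, a random codebook, and a count-based bigram model in place of the learned acoustic units and transformer LMs used in actual SLMs; none of the names or sizes come from a specific paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_frames(frames, codebook):
    """Map each acoustic feature frame to the ID of its nearest codebook
    vector (a stand-in for learned units such as clustered SSL features)."""
    # Squared distances: shape (num_frames, codebook_size).
    d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def train_bigram_lm(token_seqs, vocab_size):
    """Count-based bigram model over speech tokens -- the simplest possible
    'language model on speech units'; real SLMs use transformers instead."""
    counts = np.ones((vocab_size, vocab_size))  # add-one smoothing
    for seq in token_seqs:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def generate(lm, start, length, rng):
    """Sample a continuation of speech tokens; in a full system a vocoder
    would map these tokens back to a waveform."""
    out = [start]
    for _ in range(length - 1):
        out.append(int(rng.choice(len(lm), p=lm[out[-1]])))
    return out

# Fake 13-dim "acoustic features" and a random 32-entry codebook
# (purely illustrative stand-ins for real speech data).
frames = rng.normal(size=(200, 13))
codebook = rng.normal(size=(32, 13))

tokens = quantize_frames(frames, codebook)          # speech -> discrete tokens
lm = train_bigram_lm([tokens], vocab_size=32)       # LM over speech tokens
continuation = generate(lm, start=int(tokens[0]), length=10, rng=rng)
print(continuation)
```

The point of the sketch is the shape of the pipeline, not the components: swapping the nearest-neighbor quantizer for a self-supervised unit discoverer and the bigram table for a transformer yields the textless setup the paragraph above describes.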