Streaming Transformer
Streaming Transformers are deep learning models that process sequential data in real time or near real time, producing output as the input arrives instead of waiting for the complete sequence. They address a limitation of standard Transformers, which struggle with long sequences because full self-attention looks at the entire input and its cost grows quadratically with sequence length. Current research focuses on adapting Transformer architectures, such as decoder-only models and models with cumulative or blockwise attention mechanisms, to applications including speech recognition, machine translation, and video understanding. Efficient, low-latency processing of this kind matters most for real-time audio processing and interactive systems, where it enables faster and more responsive AI applications.
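To make the idea of blockwise attention concrete, here is a minimal NumPy sketch, not taken from any specific paper. It assumes one common variant: each position attends to every position in its own block plus a fixed number of earlier blocks. The function names (`blockwise_mask`, `masked_attention`) and the parameters `block_size` and `left_blocks` are illustrative choices, not an established API.

```python
import numpy as np


def blockwise_mask(seq_len, block_size, left_blocks=1):
    """Boolean mask where entry [i, j] is True if query i may attend to key j.

    Assumed scheme: a query sees its own block and up to `left_blocks`
    preceding blocks, and never any later block.
    """
    block_id = np.arange(seq_len) // block_size
    q_block = block_id[:, None]  # block index of each query position
    k_block = block_id[None, :]  # block index of each key position
    return (k_block <= q_block) & (k_block >= q_block - left_blocks)


def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention restricted by a boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)      # disallowed keys get zero weight
    scores -= scores.max(axis=-1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Because no query looks past the end of its own block, the output for a block can be computed as soon as that block has arrived, so latency is bounded by the block size rather than by the sequence length. Limiting `left_blocks` also caps the number of keys each query attends to, which keeps the per-block cost constant as the stream grows.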