Transformer Transducer

Transformer Transducers are neural network architectures for sequence-to-sequence tasks, most prominently automatic speech recognition (ASR): they combine a Transformer encoder with the transducer (RNN-T) framework, whose prediction and joint networks enable streaming, frame-synchronous decoding. Current research emphasizes efficiency through lightweight models, optimized decoding algorithms (such as label-looping), and novel training strategies such as token-level loss functions and global normalization. These advances aim to raise accuracy and reduce latency in streaming ASR, strengthening research benchmarks and enabling practical applications such as real-time speech translation and keyword spotting.
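
The three components named above can be made concrete with a minimal sketch. The following PyTorch code is illustrative only, not drawn from any specific paper: the class name, dimensions, and the choice of an LSTM prediction network are assumptions made for brevity. It shows the defining transducer computation, a joint network that scores every (audio frame, emitted token) pair over a T x U lattice.

```python
# Minimal transducer sketch (assumed names and sizes, for illustration only).
import torch
import torch.nn as nn

class TransducerSketch(nn.Module):
    def __init__(self, num_tokens: int, d_model: int = 256):
        super().__init__()
        # Transformer encoder over audio frames (the "Transformer" part).
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Prediction network over previously emitted tokens; in practice the
        # token sequence is blank-prepended so the output has length U + 1.
        self.embed = nn.Embedding(num_tokens, d_model)
        self.predictor = nn.LSTM(d_model, d_model, batch_first=True)
        # Joint network: maps each combined (frame t, token u) state to
        # logits over the vocabulary (index 0 reserved for the blank symbol).
        self.joiner = nn.Sequential(nn.Tanh(), nn.Linear(d_model, num_tokens))

    def forward(self, audio: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        enc = self.encoder(audio)                      # (B, T, D)
        pred, _ = self.predictor(self.embed(tokens))   # (B, U, D)
        # Broadcast-add over the (T, U) lattice used by the transducer loss.
        joint = enc.unsqueeze(2) + pred.unsqueeze(1)   # (B, T, U, D)
        return self.joiner(joint)                      # (B, T, U, num_tokens)

batch, frames, toks, vocab = 2, 50, 10, 100
logits = TransducerSketch(vocab)(torch.randn(batch, frames, 256),
                                 torch.randint(0, vocab, (batch, toks)))
print(logits.shape)  # torch.Size([2, 50, 10, 100])
```

Training would attach a transducer loss (for example, torchaudio's rnnt_loss) to these lattice logits, marginalizing over all blank/label alignments; the decoding optimizations mentioned above, such as label-looping, operate over the same lattice at inference time.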

Papers