Sequence Transducer
Sequence transducers are neural network architectures designed for efficient and accurate sequence-to-sequence mapping, primarily used in speech recognition and other time-series processing tasks. Current research focuses on improving efficiency through techniques like frame-level criteria, optimized decoding algorithms (e.g., label-looping), and model compression methods (e.g., knowledge distillation), while also exploring novel architectures such as Conformer-T and CIF-T to enhance performance and reduce latency. These advancements are significant because they enable faster, more accurate, and resource-efficient applications in areas like speech recognition, machine translation, and recommendation systems, particularly on resource-constrained devices.
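To make the architecture concrete, the sketch below shows the core of a transducer (RNN-T-style) model: an acoustic encoder emits one vector per frame, a prediction network emits one vector per output label, and a joint network scores every (frame, label) pair over the vocabulary plus a blank symbol. This is a minimal illustration with hypothetical module and dimension names, not an implementation from any of the papers listed here.

```python
# Minimal transducer joint network sketch (hypothetical names and sizes).
# Assumes an RNN-T-style setup: encoder output (batch, T, enc_dim),
# prediction-network output (batch, U, pred_dim).
import torch
import torch.nn as nn

class TransducerJoiner(nn.Module):
    def __init__(self, enc_dim: int, pred_dim: int, joint_dim: int, vocab_size: int):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, joint_dim)
        self.pred_proj = nn.Linear(pred_dim, joint_dim)
        # +1 output for the blank symbol, which lets the model advance a frame
        # without emitting a label.
        self.out = nn.Linear(joint_dim, vocab_size + 1)

    def forward(self, enc: torch.Tensor, pred: torch.Tensor) -> torch.Tensor:
        # Broadcast encoder frames against prediction states to form a
        # (batch, T, U, joint_dim) lattice, then score every cell.
        joint = self.enc_proj(enc).unsqueeze(2) + self.pred_proj(pred).unsqueeze(1)
        return self.out(torch.tanh(joint))  # (batch, T, U, vocab_size + 1)

# Example: 2 utterances, 50 acoustic frames, 10 target labels, 1000-token vocabulary.
joiner = TransducerJoiner(enc_dim=256, pred_dim=256, joint_dim=320, vocab_size=1000)
logits = joiner(torch.randn(2, 50, 256), torch.randn(2, 10, 256))
print(logits.shape)  # torch.Size([2, 50, 10, 1001])
```

The resulting lattice of label/blank scores is what the transducer loss marginalizes over during training and what decoding algorithms (including label-looping variants) traverse at inference time.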
Papers
Transformers as Transducers
Lena Strobl, Dana Angluin, David Chiang, Jonathan Rawski, Ashish Sabharwal
Effective internal language model training and fusion for factorized transducer model
Jinxi Guo, Niko Moritz, Yingyi Ma, Frank Seide, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer