Seq2seq Model
Sequence-to-sequence (Seq2seq) models are neural network architectures that map an input sequence to an output sequence, addressing tasks such as machine translation and text summarization. Current research focuses on improving Seq2seq performance through architectural innovations (e.g., Transformers, LSTMs) and training methodologies such as bidirectional awareness induction and knowledge distillation, particularly for low-resource scenarios. These advances are driving improvements in natural language processing, medical image analysis, and other areas that require sequence-to-sequence transformations.
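To make the encoder-decoder idea concrete, the sketch below shows a minimal LSTM-based Seq2seq model in PyTorch. It is an illustrative assumption, not the method of any paper listed here; all names, dimensions, and vocabulary sizes are hypothetical.

```python
# Minimal sketch of an LSTM encoder-decoder (Seq2seq) model.
# All class names, dimensions, and vocabulary sizes are illustrative assumptions.
import torch
import torch.nn as nn


class Seq2seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hid_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source sequence into a final hidden/cell state.
        _, (h, c) = self.encoder(self.src_emb(src))
        # Decode the target sequence conditioned on the encoder state
        # (teacher forcing: gold target tokens are fed as decoder inputs).
        dec_out, _ = self.decoder(self.tgt_emb(tgt), (h, c))
        return self.out(dec_out)  # logits over the target vocabulary


# Usage with random token ids: batch of 2, source length 7, target length 5.
model = Seq2seq(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (2, 7))
tgt = torch.randint(0, 1200, (2, 5))
logits = model(src, tgt)  # shape: (2, 5, 1200)
```

Transformer-based Seq2seq models replace the LSTM encoder and decoder with self-attention layers, but the overall encode-then-decode structure is the same.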
Papers
Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models
Saleh Soltan, Andy Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza
EM-Network: Oracle Guided Self-distillation for Sequence Learning
Ji Won Yoon, Sunghwan Ahn, Hyeonseung Lee, Minchan Kim, Seok Min Kim, Nam Soo Kim
T5-SR: A Unified Seq-to-Seq Decoding Strategy for Semantic Parsing
Yuntao Li, Zhenpeng Su, Yutian Li, Hanchu Zhang, Sirui Wang, Wei Wu, Yan Zhang