Speech Translation

May 12, 2022

Controlling Formality in Low-Resource NMT with Domain Adaptation and Re-Ranking: SLT-CDT-UoS at IWSLT2022
Sebastian T. Vincent, Loïc Barrault, Carolina Scarton
Domain Adaptation Large Corpus Speech Translation Language Agnostic Ranking Step NMT System Natural Language Annotation Formality Distribution

May 5, 2022

Efficient yet Competitive Speech Translation: FBK@IWSLT2022
Marco Gaido, Sara Papi, Dennis Fucci, Giuseppe Fiameni, Matteo Negri, Marco Turchi
Training Data High Efficiency Speech Translation Simultaneous Speech Translation Lightweight Continual

Cross-modal Contrastive Learning for Speech Translation
Rong Ye, Mingxuan Wang, Lei Li
Speech Translation Cross Modal Retrieval Unified Representation Cross Modal Contrastive Learning End Speech to Text Translation

May 4, 2022

ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks
Marcely Zanon Boito, John Ortega, Hugo Riguidel, Antoine Laurent, Loïc Barrault, Fethi Bougares, Firas Chaabani, Ha Nguyen, Florentin Barbier, Souhir Gahbiche, Yannick Estève
Automatic Speech Recognition Speech Translation Language Modeling Loss Dialect Speaker

April 22, 2022

LibriS2S: A German-English Speech-to-Speech Translation Corpus
Pedro Jeuris, Jan Niehues
Speech Translation Speech Corpus Text to Speech Model Speech to Speech Translation Speech Translation Corpus

April 19, 2022

Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation
Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora
Automatic Speech Recognition Cross Lingual Speech Translation Spoken Language Understanding Simultaneous Speech Translation Speech Processing Task Streaming Transformer

April 11, 2022

April 8, 2022

GigaST: A 10,000-hour Pseudo Speech Translation Corpus
Rong Ye, Chengqi Zhao, Tom Ko, Chutong Meng, Tao Wang, Mingxuan Wang, Jun Cao
Speech Translation Gigapixel Image Speech Translation Corpus Speech Recognition Corpus

March 29, 2022

Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation
Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Speech Translation End to End Speech Translation Speech Segmentation Inappropriate Pause

March 28, 2022

Multilingual Simultaneous Speech Translation
Shashank Subramanya, Jan Niehues
Multilingual Model Speech Translation Multilingual Corpus Simultaneous Speech Translation Offline Speech Translation

March 22, 2022

Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation
Chih-Chiang Chang, Hung-yi Lee
Speech Translation Simultaneous Speech Translation Adaptive Policy

March 20, 2022

STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation
Qingkai Fang, Rong Ye, Lei Li, Yang Feng, Mingxuan Wang
Speech Representation Speech Translation Multimodal Sequence End Speech to Text Translation Speech Translation Benchmark Speech Text Manifold Mixup

March 18, 2022

Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation
Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi
Machine Translation Gender Bias Speech Translation Morphosyntactic Analysis

March 16, 2022

Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation
Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler
Data Augmentation Speech Translation Language Pair End to End Speech Translation Source Speech Synthetic Data Augmentation Model Recombination Audio Alignment

March 4, 2022

From Simultaneous to Streaming Machine Translation by Leveraging Streaming History
Javier Iranzo-Sánchez, Jorge Civera, Alfons Juan
Sentence Level Speech Translation Simultaneous Machine Translation Online Streaming Simultaneous Translation Concurrent Transmission Monotonic Translation

February 11, 2022

Evaluating MT Systems: A Theoretical Framework
Rajeev Sangal
Speech Translation Natural Language Description Machine Translation System Automatic Metric Theoretical Framework MT Evaluation

February 3, 2022

mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna, Colin Cherry, Yu Zhang, Ye Jia, Melvin Johnson, Yong Cheng, Simran Khanuja, Jason Riesa, Alexis Conneau
Text Modality Speech Analysis Speech Translation Multilingual Automatic Speech Recognition Multilingual Speech Multilingual Pretraining Speech Translation Model Cross Lingual Cross Modal

January 27, 2022

Prabhupadavani: A Code-mixed Speech Translation Data for 25 Languages
Jivnesh Sandhan, Ayush Daksh, Om Adideva Paranjay, Laxmidhar Behera, Pawan Goyal
Speech Translation Unknown Language Code Mixed

Papers

Unified Speech-Text Pre-training for Speech Translation and Recognition

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

End-to-End Speech Translation for Code Switched Speech