Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation [2210.17027]