Lingual Voice Conversion

Lingual voice conversion modifies a speaker's voice to match a target speaker's characteristics while preserving the original speech content, including in cross-lingual settings where the source and target speakers use different languages. Current research emphasizes robust models that disentangle speaker identity from speech content, often using cycle-consistent architectures and variational autoencoders (VAEs) to achieve this separation even when multilingual data is limited. The field matters for three reasons: it improves speech-to-speech translation systems, it augments speech recognition datasets (especially for under-resourced languages), and it mitigates privacy concerns in voice-based applications by avoiding direct voice cloning.
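The two training signals named above can be illustrated with a minimal sketch. It assumes the generic textbook forms of a beta-weighted VAE objective and a round-trip cycle-consistency loss, not the specific losses used in any paper listed here. All function names, the `beta=4.0` default, and the toy converter are hypothetical and chosen only for illustration.

```python
import math


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, exp(logvar)) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar)
    )


def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Mean squared reconstruction error plus a beta-weighted KL term.

    A larger beta pushes the latent code toward the prior, which is the
    usual lever for encouraging disentangled representations.
    The default of 4.0 is an arbitrary illustrative value.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon)) / len(x)
    return recon + beta * kl_to_standard_normal(mu, logvar)


def cycle_consistency_loss(x, convert, src_speaker, tgt_speaker):
    """Mean L1 distance between x and its source -> target -> source round trip."""
    converted = convert(x, tgt_speaker)
    restored = convert(converted, src_speaker)
    return sum(abs(a - b) for a, b in zip(x, restored)) / len(x)


def toy_convert(features, speaker_offset):
    """Hypothetical stand-in for a conversion model.

    Treats the mean of the features as the 'speaker' and the deviations
    from the mean as the 'content', then swaps in the requested offset.
    """
    mean = sum(features) / len(features)
    return [f - mean + speaker_offset for f in features]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]  # toy feature frame whose mean (2.0) is the source speaker
    print(beta_vae_loss(x, x, mu=[0.0], logvar=[0.0]))
    print(cycle_consistency_loss(x, toy_convert, src_speaker=2.0, tgt_speaker=5.0))
```

In this toy setup the round trip restores the input because the converter changes only the mean, which is the property a cycle-consistency loss rewards. Real systems apply the same idea to learned speaker and content embeddings rather than a scalar offset.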

Papers

August 8, 2024

MulliVC: Multi-lingual Voice Conversion With Cycle Consistency
Jiawei Huang, Chen Zhang, Yi Ren, Ziyue Jiang, Zhenhui Ye, Jinglin Liu, Jinzheng He, Xiang Yin, Zhou Zhao
Voice Conversion Multilingual Dataset Monolingual Data Cycle Consistency Lingual Voice Conversion

July 18, 2024

Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems
Daniel Platnick, Bishoy Abdelnour, Eamon Earl, Rahul Kumar, Zahra Rezaei, Thomas Tsangaris, Faraj Lagum
Privacy Policy Speech to Speech Translation Direct S2ST Lingual Voice Conversion

June 12, 2024

Improving child speech recognition with augmented child-like speech
Yuanyuan Zhang, Zhengjun Yue, Tanvina Patel, Odette Scharenborg
Child Speech Child Speech Recognition Lingual Voice Conversion

October 10, 2023

AutoCycle-VC: Towards Bottleneck-Independent Zero-Shot Cross-Lingual Voice Conversion
Haeyun Choi, Jio Gim, Yuho Lee, Youngin Kim, Young-Joo Suh
Synthesized Speech Speech Encoder Zero Shot Voice Conversion Speaker Independent Lingual Voice Conversion

October 31, 2022

Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation
Kun Wei, Long Zhou, Ziqiang Zhang, Liping Chen, Shujie Liu, Lei He, Jinyu Li, Furu Wei
Speech Analysis Speech Translation Speech to Speech Translation Direct Speech to Speech Translation Joint Training Lingual Voice Conversion

October 25, 2022

Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using $\beta$-VAE
Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng
Disentangled Representation Speaker Representation Speech Representation Disentanglement Lingual Voice Conversion

March 29, 2022

ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion
Edresson Casanova, Christopher Shulby, Alexander Korolev, Arnaldo Candido Junior, Anderson da Silva Soares, Sandra Aluísio, Moacir Antonelli Ponti
Automatic Speech Recognition Speech Synthesis Low Resource Voice Conversion Cross Lingual Text to Speech Lingual Voice Conversion
