Speaker Normalization
Speaker normalization aims to remove speaker-specific characteristics from speech signals, leaving only the linguistic or emotional content. Current research focuses on disentangling speaker and phonetic information within self-supervised speech representations, often using techniques such as principal component analysis or variational autoencoders, and on discrete speech units for improved efficiency. These advances are important for making speech processing systems more robust and generalizable across diverse speakers, with applications ranging from speech synthesis and translation to assisting individuals with dysarthria and mitigating online hate speech.
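As a rough illustration of the principal component analysis idea mentioned above, the following minimal sketch estimates speaker directions from per-speaker mean vectors and projects them out of frame-level features. It is not taken from any particular paper: the toy data, the feature dimension, the number of removed components, and all function names (speaker_subspace, remove_speaker_directions, between_speaker_spread, make_toy_data) are assumptions made for this example, and real self-supervised representations would replace the synthetic features.

```python
import numpy as np


def speaker_means(features, speaker_ids):
    """Return one mean feature vector per speaker, stacked as rows."""
    speakers = np.unique(speaker_ids)
    return np.stack([features[speaker_ids == s].mean(axis=0) for s in speakers])


def speaker_subspace(features, speaker_ids, n_components):
    """Estimate speaker directions via PCA on the per-speaker means."""
    means = speaker_means(features, speaker_ids)
    centered = means - means.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the speaker means.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components]


def remove_speaker_directions(features, directions):
    """Project features onto the orthogonal complement of the directions."""
    return features - features @ directions.T @ directions


def between_speaker_spread(features, speaker_ids):
    """Frobenius norm of the centered per-speaker means."""
    means = speaker_means(features, speaker_ids)
    return np.linalg.norm(means - means.mean(axis=0))


def make_toy_data(seed=0):
    """Synthetic frames (an assumption): shared content plus a speaker offset."""
    rng = np.random.default_rng(seed)
    n_speakers, frames, dim = 4, 200, 16
    speaker_ids = np.repeat(np.arange(n_speakers), frames)
    offsets = np.zeros((n_speakers, dim))
    # Speaker identity lives in the first two dimensions of the toy features.
    offsets[:, :2] = rng.normal(scale=5.0, size=(n_speakers, 2))
    content = rng.normal(size=(n_speakers * frames, dim))
    return content + offsets[speaker_ids], speaker_ids


if __name__ == "__main__":
    features, speaker_ids = make_toy_data()
    directions = speaker_subspace(features, speaker_ids, n_components=2)
    normalized = remove_speaker_directions(features, directions)
    print("spread before:", between_speaker_spread(features, speaker_ids))
    print("spread after: ", between_speaker_spread(normalized, speaker_ids))
```

In this toy setting the between-speaker spread drops sharply after projection because the speaker offsets were constructed to lie in a low-dimensional subspace. Whether real representations separate this cleanly, and how many components to remove, are empirical questions that this sketch does not answer.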