Speaker Attractor

Speaker attractors are high-dimensional embeddings that neural networks use to perform speaker diarization and related tasks such as voice anti-spoofing and text-to-speech synthesis. Current research focuses on making attractor-based models more efficient and robust, often by using transformer architectures and by incorporating conversational context or target-speaker information. These advances aim at more accurate and efficient speaker separation in audio, with applications ranging from meeting transcription to voice authentication.
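To make the role of attractors in diarization more concrete, here is a minimal sketch of one common way they are used: each frame embedding is compared with each attractor by a dot product, and a sigmoid turns the score into a per-frame, per-speaker activity probability. The function name `speaker_activity`, the 0.5 threshold, and the hand-set toy vectors are assumptions made for illustration. In a trained model the frame embeddings and attractors would come from learned network layers, which this sketch does not include.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def speaker_activity(frames, attractors, threshold=0.5):
    """Return a T x S matrix of 0/1 speaker-activity decisions.

    frames:     list of T frame embeddings (each a list of D floats)
    attractors: list of S speaker attractors (each a list of D floats)
    threshold:  hypothetical decision threshold on the probability
    """
    decisions = []
    for frame in frames:
        row = []
        for attractor in attractors:
            # Dot product measures how well the frame matches this attractor.
            score = sum(f * a for f, a in zip(frame, attractor))
            row.append(1 if sigmoid(score) > threshold else 0)
        decisions.append(row)
    return decisions


# Toy, hand-set values (assumptions, not outputs of a trained model).
attractors = [[4.0, 0.0], [0.0, 4.0]]  # two hypothetical speakers
frames = [
    [1.0, -1.0],   # matches only the first attractor
    [-1.0, 1.0],   # matches only the second attractor
    [1.0, 1.0],    # matches both: overlapping speech
    [-1.0, -1.0],  # matches neither: silence
]
activity = speaker_activity(frames, attractors)
for t, row in enumerate(activity):
    print(f"frame {t}: {row}")
```

Because each speaker gets an independent sigmoid rather than a shared softmax, a frame can be assigned to several speakers at once or to none, which is how this kind of formulation represents overlapping speech and silence.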

Papers