Paper ID: 2305.12703

Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification

Zhuo Li, Jingze Lu, Zhenduo Zhao, Wenchao Wang, Pengyuan Zhang

Exploiting large-scale unlabeled data from the target domain via pseudo-label clustering algorithms is an important approach to domain adaptation in speaker verification. In this paper, we propose a novel progressive subgraph clustering algorithm based on multi-model voting and double-Gaussian based assessment (PGMVG clustering). To fully exploit the relationships among utterances and the complementarity among multiple models, our method constructs multiple k-nearest-neighbor graphs from diverse models and generates high-confidence edges using a voting mechanism. Further, to maximize intra-class diversity, connected subgraphs are used to obtain the initial pseudo-labels. Finally, to prevent disastrous clustering results, we adopt an iterative approach that progressively increases k and employs a double-Gaussian based assessment algorithm to decide whether to merge sub-classes.
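
The first stage described above (per-model k-nearest-neighbor graphs, edge voting, and connected subgraphs as initial pseudo-labels) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of cosine similarity, the `min_votes` threshold, and the default of requiring a unanimous vote are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def knn_edges(emb, k):
    """Undirected k-NN edges under cosine similarity (similarity choice is an assumption)."""
    x = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    nbrs = np.argsort(-sim, axis=1)[:, :k]
    return {(min(i, int(j)), max(i, int(j)))
            for i in range(len(emb)) for j in nbrs[i]}

def voted_edges(embeddings_per_model, k, min_votes):
    """Keep edges that appear in the k-NN graphs of at least min_votes models."""
    votes = {}
    for emb in embeddings_per_model:
        for edge in knn_edges(emb, k):
            votes[edge] = votes.get(edge, 0) + 1
    return [edge for edge, count in votes.items() if count >= min_votes]

def connected_components(n, edges):
    """Label each node with the index of its connected subgraph (union-find)."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra
    roots = [find(i) for i in range(n)]
    relabel = {r: idx for idx, r in enumerate(dict.fromkeys(roots))}
    return np.array([relabel[r] for r in roots])

def initial_pseudo_labels(embeddings_per_model, k=2, min_votes=None):
    """Hypothetical helper: voted k-NN graph -> connected subgraphs -> pseudo-labels."""
    n = len(embeddings_per_model[0])
    if min_votes is None:
        min_votes = len(embeddings_per_model)  # assumption: unanimous vote
    edges = voted_edges(embeddings_per_model, k, min_votes)
    return connected_components(n, edges)
```

Each element of `embeddings_per_model` is assumed to be an array of shape (utterances, dimensions) holding one model's embeddings for the same utterances in the same order. The progressive stage (increasing k and deciding merges with the double-Gaussian based assessment) is not sketched, since the abstract does not give its details.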

Submitted: May 22, 2023