Short Text Clustering

Short text clustering aims to automatically group semantically similar short texts, a task made difficult by data sparsity and noise. Recent research focuses on improving robustness through optimal transport for pseudo-label generation, contrastive learning for better representations, and iterative frameworks that combine clustering with classification objectives. This work also draws on large language models and federated learning to address data imbalance, noise, and privacy concerns, leading to more accurate and interpretable clustering results in applications such as social media analysis and information retrieval.
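
To make the idea of optimal transport for pseudo-label generation more concrete, the following is a minimal sketch, not the method of any particular paper. It assumes Sinkhorn-Knopp iterations (a common way to solve entropy-regularized optimal transport) and assumes that clusters should receive equal total mass; the function names `sinkhorn_pseudo_labels` and `hard_labels`, the `epsilon` and `n_iters` defaults, and the toy scores are all hypothetical.

```python
import math


def sinkhorn_pseudo_labels(scores, epsilon=0.5, n_iters=100):
    """Turn per-text cluster scores into balanced soft pseudo-labels.

    scores: list of rows, one per text, each holding one score per cluster.
    Returns a matrix whose rows sum to 1 and whose columns each sum to
    roughly n_texts / n_clusters (equal-size clusters are an assumption).
    """
    n, k = len(scores), len(scores[0])
    top = max(max(row) for row in scores)
    # Gibbs kernel; subtracting the global max keeps exp() from overflowing.
    q = [[math.exp((s - top) / epsilon) for s in row] for row in scores]
    for _ in range(n_iters):
        # Column step: give every cluster the same total mass, n / k.
        col_sums = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] * (n / k) / col_sums[j] for j in range(k)]
             for i in range(n)]
        # Row step: each text's assignment becomes a probability distribution.
        q = [[v / sum(row) for v in row] for row in q]
    return q


def hard_labels(q):
    """Pick the most likely cluster for each text."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Toy scores skewed toward cluster 0; the values are made up for illustration.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4]]
labels = hard_labels(sinkhorn_pseudo_labels(scores))
print(labels)
```

A plain argmax over these toy scores would put every text in cluster 0. With the balancing constraint, the two least confident texts move to cluster 1, which illustrates how this kind of pseudo-labeling can counteract degenerate or imbalanced assignments. In practice the scores would come from a learned model, and the equal-size assumption would likely need to be relaxed for imbalanced data.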

Papers