Hard Clustering

Hard clustering assigns each data point to exactly one of a set of distinct, non-overlapping groups, typically by minimizing within-cluster distances or maximizing separation between clusters. Recent research focuses on robust algorithms, such as variants of k-means and Bregman clustering, that address imbalanced data, high dimensionality, and the choice of the number of clusters, often drawing on techniques from information theory and optimal transport. These advances aim to improve clustering accuracy and efficiency in applications including federated learning, network analysis, and large language model compression, where clustering supports data summarization and representation learning. Research also targets better evaluation metrics for comparing clustering results, particularly in the context of fuzzy clustering, where points can hold partial membership in several clusters.
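
As a concrete illustration of hard assignment, here is a minimal sketch of plain k-means (Lloyd's algorithm) in standard-library Python. It is not taken from any of the papers listed here. The function name `kmeans`, its parameters (`init`, `max_iter`, `seed`), and the toy data are all assumptions made for this example. Each point receives exactly one label, and the loop alternates between assigning points to the nearest centroid and recomputing centroids, which reduces the within-cluster sum of squared distances.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Plain k-means sketch: returns (labels, centroids).

    `init` is an optional list of k starting centroids (hypothetical
    parameter for this example); otherwise k data points are sampled.
    """
    rng = random.Random(seed)
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point gets exactly one label (hard assignment).
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Toy data (made up for illustration): two well-separated groups in 2D.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, centroids = kmeans(points, k=2, init=[(0.0, 0.0), (5.0, 5.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The result depends on the starting centroids, and k has to be supplied by the caller, which is one reason initialization and the choice of the number of clusters remain research topics.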

Papers