Cluster Number

Determining the optimal number of clusters in a dataset (the "cluster number") is a central and still largely open problem in unsupervised machine learning. Current research focuses on algorithms that infer the cluster number automatically, often as part of the clustering procedure itself. Techniques include self-supervised learning, reinforcement learning, and density-based methods, applied within model families such as graph autoencoders and medoid-based approaches. These advances aim to make clustering more efficient and accurate, enabling more robust and automated data exploration in fields ranging from biomedical image analysis to social network analysis. Ensemble methods additionally address the choice of algorithm and hyperparameters, streamlining the overall clustering workflow.
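To make the problem concrete, here is a minimal sketch of a classical baseline for choosing the cluster number: run k-means for several candidate values of k and keep the one with the highest mean silhouette score. It is not one of the research methods summarized above. The function names (`kmeans`, `silhouette`, `estimate_cluster_number`), the farthest-point initialization, the candidate range, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; higher means tighter, better-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treat as worst case
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def estimate_cluster_number(points, k_max=6):
    """Score k = 2..k_max and return the best k together with all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores


# Synthetic data (an assumption for the demo): three well-separated 2-D blobs.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(30)]

best_k, scores = estimate_cluster_number(points)
print("estimated cluster number:", best_k)
```

A sweep like this reruns the clustering once per candidate k and depends on the chosen score and candidate range, which is part of the motivation for methods that infer the cluster number within the clustering procedure itself.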

Papers