Paper ID: 2206.09827

A Distributional Approach for Soft Clustering Comparison and Evaluation

Andrea Campagner, Davide Ciucci, Thierry Denœux

External evaluation criteria for soft clustering (SC) have received limited attention: existing methods do not provide a general way to extend comparison measures to SC, and they cannot account for the uncertainty represented in the results of SC algorithms. In this article, we propose a general method that addresses these limitations. The method is grounded in a novel interpretation of SC as distributions over hard clusterings, and we call the resulting measures \emph{distributional measures}. We provide an in-depth study of the complexity-theoretic and metric-theoretic properties of the proposed approach, and we describe approximation techniques that can make the calculations tractable. Finally, we illustrate our approach through a simple experiment.
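The abstract does not spell out the construction, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes that a soft clustering is given as a matrix of probabilistic memberships, that objects are assigned to clusters independently (so the matrix induces a distribution over hard clusterings), that the base comparison measure is the Rand index, and that the approximation is plain Monte Carlo sampling. All function names and these modelling choices are hypothetical.

```python
import random
from itertools import combinations


def sample_hard_clustering(memberships, rng):
    """Draw one hard clustering from a membership matrix.

    Assumption: each row is a probability vector over clusters and
    objects are assigned independently of one another.
    """
    return [rng.choices(range(len(row)), weights=row, k=1)[0] for row in memberships]


def rand_index(labels_a, labels_b):
    """Rand index between two hard clusterings given as label lists."""
    pairs = list(combinations(range(len(labels_a)), 2))
    # A pair agrees when both clusterings put it together or both split it.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def expected_rand_index(soft_a, soft_b, n_samples=2000, seed=0):
    """Monte Carlo estimate of the expected Rand index between two
    soft clusterings, each viewed as a distribution over hard clusterings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        hard_a = sample_hard_clustering(soft_a, rng)
        hard_b = sample_hard_clustering(soft_b, rng)
        total += rand_index(hard_a, hard_b)
    return total / n_samples


if __name__ == "__main__":
    # Hypothetical memberships for four objects and two clusters.
    soft_a = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
    soft_b = [[0.6, 0.4], [0.5, 0.5], [0.4, 0.6], [0.3, 0.7]]
    print(expected_rand_index(soft_a, soft_b))
```

Under these assumptions, a crisp (one-hot) membership matrix induces a distribution concentrated on a single hard clustering, so the estimate reduces to the ordinary Rand index. Exact computation would require enumerating every hard clustering with non-zero probability, which grows exponentially with the number of objects; sampling is one simple way to keep the cost bounded.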

Submitted: Jun 20, 2022