Paper ID: 2412.16713 • Published Dec 21, 2024
A Unifying Family of Data-Adaptive Partitioning Algorithms
Guy B. Oldaker IV, Maria Emelianenko
Clustering algorithms remain valuable tools for grouping and summarizing the
most important aspects of data. Application areas include image segmentation,
dimension reduction, signal analysis, model order reduction, and numerical
analysis. As a consequence, many clustering
approaches have been developed to satisfy the unique needs of each particular
field. In this article, we present a family of data-adaptive partitioning
algorithms that unifies several well-known methods (e.g., k-means and
k-subspaces). Indexed by a single parameter and employing a common minimization
strategy, the algorithms are easy to use and interpret, and scale well to
large, high-dimensional problems. In addition, we develop an adaptive mechanism
that (a) can automatically uncover data structures and problem parameters
without any expert knowledge and (b) can be used to augment other existing
methods. By demonstrating the performance of our methods on examples
from disparate fields including subspace clustering, model order reduction, and
matrix approximation, we hope to highlight their versatility and potential for
extending the boundaries of existing scientific domains. We believe our
family's parametrized structure represents a synergism of algorithms that will
foster new developments and directions, not least within the data science
community.
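
To make the link between k-means and k-subspaces concrete, the following is a minimal sketch of one way such a family could be organized. It is not the authors' algorithm: the abstract does not say what the single parameter is, so the sketch assumes, purely for illustration, that it is a subspace dimension q. Each cluster is modeled by a q-dimensional affine subspace, and the partition is refined by Lloyd-style alternating minimization. With q = 0 each model collapses to a centroid, which is ordinary k-means. The names fit_cluster, residuals, and partition are hypothetical.

```python
import numpy as np

def fit_cluster(points, q):
    """Best-fit q-dimensional affine subspace of the rows of `points`.

    Returns (mean, basis), where basis has orthonormal columns.
    q is an ASSUMED family parameter; q = 0 gives a plain centroid.
    """
    mean = points.mean(axis=0)
    if q == 0:
        return mean, np.zeros((points.shape[1], 0))
    # The top-q right singular vectors of the centered data span the subspace.
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[:q].T

def residuals(X, mean, basis):
    """Squared distance from each row of X to the affine subspace."""
    centered = X - mean
    projected = centered @ basis @ basis.T
    return np.sum((centered - projected) ** 2, axis=1)

def partition(X, k, q, n_iter=50, seed=0):
    """Alternate between fitting k subspace models and reassigning points."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=X.shape[0])  # random initial partition
    for _ in range(n_iter):
        models = []
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                # Re-seed an empty cluster with a single random point.
                members = X[rng.integers(X.shape[0], size=1)]
            models.append(fit_cluster(members, q))
        # Assign every point to the model with the smallest residual.
        dists = np.stack([residuals(X, m, b) for m, b in models], axis=1)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # the partition stopped changing
        labels = new_labels
    return labels, models
```

Under this assumed parametrization, partition(X, k, q=0) behaves like k-means with a random initial partition, while q >= 1 fits lines, planes, or higher-dimensional affine subspaces to each cluster. The adaptive mechanism described in the abstract is not modeled here.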