Paper ID: 2405.15983

Hierarchical Clustering via Local Search

Hossein Jowhari

In this paper, we introduce a local search algorithm for hierarchical clustering. For the local step, we consider a tree re-arrangement operation, known as the {\em interchange}, which involves swapping two closely positioned sub-trees within a tree hierarchy. The interchange operation has been previously used in the context of phylogenetic trees. As the objective function for evaluating the resulting hierarchies, we utilize the revenue function proposed by Moseley and Wang (NIPS 2017.) In our main result, we show that any locally optimal tree guarantees a revenue of at least $\frac{n-2}{3}\sum_{i < j}w(i,j)$ where is $n$ the number of objects and $w: [n] \times [n] \rightarrow \mathbb{R}^+$ is the associated similarity function. This finding echoes the previously established bound for the average link algorithm as analyzed by Moseley and Wang. We demonstrate that this alignment is not coincidental, as the average link trees enjoy the property of being locally optimal with respect to the interchange operation. Consequently, our study provides an alternative insight into the average link algorithm and reveals the existence of a broader range of hierarchies with relatively high revenue achievable through a straightforward local search algorithm. Furthermore, we present an implementation of the local search framework, where each local step requires $O(n)$ computation time. Our empirical results indicate that the proposed method, used as post-processing step, can effectively generate a hierarchical clustering with substantial revenue.

Submitted: May 24, 2024