Paper ID: 2411.11421

Towards fast DBSCAN via Spectrum-Preserving Data Compression

Yongyu Wang

This paper introduces a method that accelerates DBSCAN through spectral data compression. The approach reduces the data set to one fifth of its original size while preserving its essential clustering characteristics, which lets DBSCAN run substantially faster without any loss of accuracy. Experiments on real-world data sets such as USPS show that this fivefold reduction maintains clustering performance.
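
The compress-then-cluster pipeline described above can be illustrated with a minimal sketch. The abstract does not specify the spectral compression step, so the hypothetical `compress` function below substitutes a plain greedy nearest-neighbour aggregation that replaces each group of about five points with its centroid; it is a stand-in, not the paper's spectrum-preserving technique. The names `compress`, `dbscan`, and `compressed_dbscan`, and the parameter `ratio`, are assumptions made for illustration only.

```python
import math
from collections import deque


def compress(points, ratio=5):
    """Stand-in compression (NOT the paper's spectral method): greedily merge
    each unassigned point with its (ratio - 1) nearest unassigned neighbours
    and keep the centroid of the group."""
    n = len(points)
    owner = [None] * n          # owner[i] = index of the centroid representing point i
    centroids = []
    for i in range(n):
        if owner[i] is not None:
            continue
        free = [j for j in range(n) if owner[j] is None and j != i]
        free.sort(key=lambda j: math.dist(points[i], points[j]))
        group = [i] + free[: ratio - 1]
        for j in group:
            owner[j] = len(centroids)
        coords = zip(*(points[j] for j in group))
        centroids.append(tuple(sum(c) / len(group) for c in coords))
    return centroids, owner


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    labels = [None] * n

    def neighbours(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1      # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster   # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)   # j is a core point, keep expanding
    return labels


def compressed_dbscan(points, eps, min_pts, ratio=5):
    """Cluster the compressed set, then copy each centroid's label back to
    the original points it represents."""
    centroids, owner = compress(points, ratio)
    centroid_labels = dbscan(centroids, eps, min_pts)
    return [centroid_labels[owner[i]] for i in range(len(points))]
```

Because each centroid stands for roughly `ratio` original points, `min_pts` (and possibly `eps`) would likely need to be rescaled for the compressed set; how the paper handles this is not stated in the abstract. The saving comes from running the quadratic neighbour search on about one fifth of the points, although this naive stand-in compression is itself quadratic and is meant only to show the data flow.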

Submitted: Nov 18, 2024