Paper ID: 2406.01929

Fast networked data selection via distributed smoothed quantile estimation

Xu Zhang, Marcos M. Vasconcelos

Collecting the most informative data from a large dataset distributed over a network is a fundamental problem in many fields, including control, signal processing and machine learning. In this paper, we establish a connection between selecting the most informative data and finding the top-$k$ elements of a multiset. The top-$k$ selection in a network can be formulated as a distributed nonsmooth convex optimization problem known as quantile estimation. Unfortunately, the lack of smoothness in the local objective functions leads to extremely slow convergence and poor scalability with respect to the network size. To overcome the deficiency, we propose an accelerated method that employs smoothing techniques. Leveraging the piecewise linearity of the local objective functions in quantile estimation, we characterize the iteration complexity required to achieve top-$k$ selection, a challenging task due to the lack of strong convexity. Several numerical results are provided to validate the effectiveness of the algorithm and the correctness of the theory.

Submitted: Jun 4, 2024