Paper ID: 2304.07050

Lossy Compression of Large-Scale Radio Interferometric Data

M Atemkeng, S Perkins, E Seck, S Makhathini, O Smirnov, L Bester, B Hugo

This work proposes to reduce visibility data volume using a baseline-dependent lossy compression technique that preserves smearing at the edges of the field-of-view. We exploit the relation of the rank of a matrix and the fact that a low-rank approximation can describe the raw visibility data as a sum of basic components where each basic component corresponds to a specific Fourier component of the sky distribution. As such, the entire visibility data is represented as a collection of data matrices from baselines, instead of a single tensor. The proposed methods are formulated as follows: provided a large dataset of the entire visibility data; the first algorithm, named $simple~SVD$ projects the data into a regular sampling space of rank$-r$ data matrices. In this space, the data for all the baselines has the same rank, which makes the compression factor equal across all baselines. The second algorithm, named $BDSVD$ projects the data into an irregular sampling space of rank$-r_{pq}$ data matrices. The subscript $pq$ indicates that the rank of the data matrix varies across baselines $pq$, which makes the compression factor baseline-dependent. MeerKAT and the European Very Long Baseline Interferometry Network are used as reference telescopes to evaluate and compare the performance of the proposed methods against traditional methods, such as traditional averaging and baseline-dependent averaging (BDA). For the same spatial resolution threshold, both $simple~SVD$ and $BDSVD$ show effective compression by two-orders of magnitude higher than traditional averaging and BDA. At the same space-saving rate, there is no decrease in spatial resolution and there is a reduction in the noise variance in the data which improves the S/N to over $1.5$ dB at the edges of the field-of-view.

Submitted: Apr 14, 2023