Paper ID: 2306.12461
On-orbit model training for satellite imagery with label proportions
Raúl Ramos-Pollán, Fabio A. González
This work addresses the challenge of training supervised machine learning or deep learning models on orbiting platforms, where we are generally constrained by limited on-board hardware capabilities and restricted uplink bandwidth. We aim to enable orbiting spacecraft to (1) continuously train a lightweight model as they acquire imagery; and (2) receive new labels while on orbit to refine or even change the predictive task being trained. For this, we consider chip-level regression tasks (e.g. predicting the vegetation percentage of a 20 km$^2$ patch) when only coarser label proportions are available, such as municipality-level vegetation statistics (a municipality containing several patches). Such label proportions have the additional advantage that they usually come as tabular data and are widely available in many regions of the world and application areas. This can be framed as a Learning from Label Proportions (LLP) problem. LLP applied to Earth Observation (EO) data is still an emerging field, and performing comparative studies in applied scenarios remains a challenge due to the lack of standardized datasets. In this work, first, we show how very simple deep learning and probabilistic methods (with $\sim$5K parameters) generally perform better than standard, more complex ones, providing a surprising level of finer-grained spatial detail when trained with much coarser label proportions. Second, we publish a set of benchmarking datasets enabling comparative LLP studies applied to EO, providing both fine-grained labels and aggregated data according to existing administrative divisions. Finally, we show how this approach fits an on-orbit training scenario by vastly reducing both the amount of computation and the size of the label sets. Source code is available at https://github.com/rramosp/llpeo
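The LLP framing in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes equal-area chips, so that a municipality's proportion is the plain mean of its chip predictions, and it uses a squared-error loss chosen only for illustration. The function name `bag_proportion_loss` and all data values are hypothetical.

```python
from collections import defaultdict


def bag_proportion_loss(chip_preds, chip_bags, bag_labels):
    """Mean squared error between aggregated chip predictions and bag labels.

    chip_preds: predicted vegetation fraction per chip (hypothetical values)
    chip_bags:  identifier of the municipality (bag) each chip belongs to
    bag_labels: known vegetation proportion per municipality
    Assumption: all chips have equal area, so the bag-level prediction
    is the unweighted mean of its chip-level predictions.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pred, bag in zip(chip_preds, chip_bags):
        sums[bag] += pred
        counts[bag] += 1
    # Compare each bag's mean prediction against its coarse label.
    errors = [(sums[b] / counts[b] - bag_labels[b]) ** 2 for b in sums]
    return sum(errors) / len(errors)


# Four chips spread over two municipalities, "A" and "B".
chip_preds = [0.2, 0.4, 0.9, 0.7]
chip_bags = ["A", "A", "B", "B"]
bag_labels = {"A": 0.3, "B": 0.5}

loss = bag_proportion_loss(chip_preds, chip_bags, bag_labels)
print(f"bag-level loss: {loss:.3f}")
```

Only the per-municipality proportions enter the loss, which is why the label set that would need to be uplinked is much smaller than a set of per-chip labels.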
Submitted: Jun 21, 2023