Paper ID: 2501.07183
Kriging and Gaussian Process Interpolation for Georeferenced Data Augmentation
Frédérick Fabre Ferber (LIM, UPR Recyclage et risque), Dominique Gay (LIM), Jean-Christophe Soulié (UPR Recyclage et risque), Jean Diatta (LIM), Odalric-Ambrym Maillard (Scool)
Data augmentation is a crucial step in the development of robust supervised learning models, especially when dealing with limited datasets. This study explores interpolation techniques for the augmentation of geo-referenced data, with the aim of predicting the presence of Commelina benghalensis L. in sugarcane plots in La Réunion. Given the spatial nature of the data and the high cost of data collection, we evaluated two interpolation approaches: Gaussian processes (GPs) with different kernels and kriging with various variograms. The objectives of this work are threefold: (i) to identify which interpolation methods offer the best predictive performance for various regression algorithms, (ii) to analyze the evolution of performance as a function of the number of observations added, and (iii) to assess the spatial consistency of augmented datasets. The results show that GP-based methods, in particular with combined kernels (GP-COMB), significantly improve the performance of regression algorithms while requiring less additional data. Although kriging shows slightly lower performance, it is distinguished by a more homogeneous spatial coverage, a potential advantage in certain contexts.
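The general idea of GP-based augmentation with a combined kernel can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the kernel combination (a sum of a squared-exponential and an exponential kernel), the length scales, the synthetic coordinates and target values, and the helper names `combined_kernel` and `gp_interpolate`. The abstract does not state which kernels GP-COMB combines.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel on pairwise Euclidean distances.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def exponential_kernel(a, b, length_scale=1.0):
    # Exponential kernel (Matern with nu = 1/2).
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
    return np.exp(-d / length_scale)

def combined_kernel(a, b):
    # Hypothetical "combined" kernel: a sum of two valid kernels is a valid kernel.
    return rbf_kernel(a, b, length_scale=1.0) + exponential_kernel(a, b, length_scale=2.0)

def gp_interpolate(x_obs, y_obs, x_new, kernel=combined_kernel, noise=1e-6):
    # GP posterior mean at x_new, using a constant prior mean equal to the sample mean.
    k_oo = kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_no = kernel(x_new, x_obs)
    mean = y_obs.mean()
    alpha = np.linalg.solve(k_oo, y_obs - mean)
    return mean + k_no @ alpha

# Synthetic stand-in for georeferenced observations (2-D coordinates, scalar target).
rng = np.random.default_rng(0)
x_obs = rng.uniform(0, 10, size=(30, 2))
y_obs = np.sin(x_obs[:, 0]) + 0.5 * np.cos(x_obs[:, 1])

# Interpolate at new locations and append them to form the augmented dataset.
x_new = rng.uniform(0, 10, size=(10, 2))
y_new = gp_interpolate(x_obs, y_obs, x_new)
x_aug = np.vstack([x_obs, x_new])
y_aug = np.concatenate([y_obs, y_new])
```

The augmented arrays `x_aug` and `y_aug` could then be passed to any regression algorithm. In practice, kernel hyperparameters would be fitted to the data rather than fixed as they are in this sketch.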
Submitted: Jan 13, 2025