Paper ID: 2410.16947
ISImed: A Framework for Self-Supervised Learning using Intrinsic Spatial Information in Medical Images
Nabil Jabareen, Dongsheng Yuan, Sören Lukassen
This paper demonstrates that spatial information can be used to learn interpretable representations in medical images using Self-Supervised Learning (SSL). Our proposed method, ISImed, is based on the observation that medical images exhibit much lower variability between images than classic computer vision benchmarks. By leveraging this resemblance of human body structures across multiple images, we establish a self-supervised objective that creates a latent representation capable of capturing its location in the physical realm. More specifically, our method samples image crops and constructs a distance matrix that compares the learned representation vectors of all possible pairs of these crops to the true physical distances between them. The intuition is that the learned latent space acts as a positional encoding for a given image crop. We hypothesize that learning these positional encodings forces the model to generate comprehensive image representations. To test this hypothesis and evaluate our method, we compare our learned representations with two state-of-the-art SSL benchmark methods on two publicly available medical imaging datasets. We show that our method can efficiently learn representations that capture the underlying structure of the data and that transfer to a downstream classification task.
Submitted: Oct 22, 2024
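To make the objective concrete, here is a minimal sketch of the distance-matching idea described in the abstract, assuming a PyTorch-style setup. The function name `isimed_style_loss`, the use of squared error between the two distance matrices, and the absence of any normalization are illustrative assumptions; the abstract does not specify these details.

```python
import torch

def isimed_style_loss(z: torch.Tensor, centers: torch.Tensor) -> torch.Tensor:
    """Match pairwise latent distances to true physical distances between crops.

    z:       (N, d) learned representation vectors for N image crops.
    centers: (N, k) physical coordinates of the crop centers (e.g., voxel indices).
    Hypothetical sketch of the objective described in the abstract, not the
    authors' exact implementation.
    """
    latent_dist = torch.cdist(z, z, p=2)                 # (N, N) distances in latent space
    physical_dist = torch.cdist(centers, centers, p=2)   # (N, N) true physical distances
    # Penalize disagreement between the two distance matrices (squared error here,
    # which is an assumption; the paper may use a different comparison).
    return torch.mean((latent_dist - physical_dist) ** 2)

# Toy usage: 8 crops sampled from a 3D volume, 32-dimensional representations.
z = torch.randn(8, 32, requires_grad=True)
centers = torch.rand(8, 3) * 100.0  # hypothetical crop-center coordinates
loss = isimed_style_loss(z, centers)
loss.backward()
```

Under this reading, minimizing the loss drives the latent geometry to mirror the crops' spatial arrangement in the body, which is what makes the representation act as a positional encoding.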