Paper ID: 2112.12021
Community Detection in Medical Image Datasets: Using Wavelets and Spectral Methods
Roozbeh Yousefzadeh
Medical image datasets can have large number of images representing patients with different health conditions and various disease severity. When dealing with raw unlabeled image datasets, the large number of samples often makes it hard for experts and non-experts to understand the variety of images present in a dataset. Supervised learning methods rely on labeled images which requires a considerable effort by medical experts to first understand the communities of images present in the data and then labeling the images. Here, we propose an algorithm to facilitate the automatic identification of communities in medical image datasets. We further demonstrate that such analysis can be insightful in a supervised setting when the images are already labeled. Such insights are useful because, health and disease severity can be considered a continuous spectrum, and within each class, there usually are finer communities worthy of investigation, especially when they have similarities to communities in other classes. In our approach, we use wavelet decomposition of images in tandem with spectral methods. We show that the eigenvalues of a graph Laplacian can reveal the number of notable communities in an image dataset. Moreover, analyzing the similarities may be used to infer a spectrum representing the severity of the disease. In our experiments, we use a dataset of images labeled with different conditions for COVID patients. We detect 25 communities in the dataset and then observe that only 6 of those communities contain patients with pneumonia. We also investigate the contents of a colorectal cancer histology dataset.
Submitted: Dec 22, 2021