Medical Datasets
Medical datasets are crucial for training effective machine learning models in healthcare, but their inherent privacy concerns and heterogeneity pose significant challenges. Current research focuses on developing methods to mitigate these issues, including techniques for synthesizing privacy-preserving datasets, distilling large datasets into smaller, representative subsets, and employing federated learning to train models collaboratively across multiple institutions without directly sharing sensitive data. These advancements are vital for improving the accuracy and generalizability of AI models in healthcare, ultimately leading to better diagnostics, treatment, and resource allocation.
Papers
Leveraging Prompt-Learning for Structured Information Extraction from Crohn's Disease Radiology Reports in a Low-Resource Language
Liam Hazan, Gili Focht, Naama Gavrielov, Roi Reichart, Talar Hagopian, Mary-Louise C. Greer, Ruth Cytter Kuint, Dan Turner, Moti Freiman
MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems
Tiago Mota, M. Rita Verdelho, Alceu Bissoto, Carlos Santiago, Catarina Barata