Medical Vision

Medical vision research develops AI systems that interpret medical images together with associated text, such as radiology reports, to improve diagnosis and treatment. Current work emphasizes multimodal medical vision-language pre-training (Med-VLP) with transformer-based architectures, often incorporating techniques such as contrastive learning, masked autoencoders, and federated learning to address data scarcity and heterogeneity. The aim is robust, fair models that handle diverse prompt styles and perform well across downstream medical tasks such as image classification, segmentation, and report generation, ultimately supporting clinicians in their decision-making.

Papers