Cross-Modal Distillation

Cross-modal distillation aims to improve models trained on a limited or less informative modality (e.g., event-camera images, sparse point clouds) by transferring knowledge from a model trained on a richer modality (e.g., LiDAR data, high-resolution images). Current research focuses on the distillation strategy itself, often combining contrastive learning, attention mechanisms, and adaptive fusion within architectures such as vision transformers and dual-encoder models. The approach is particularly valuable when labeled data is scarce or annotation is expensive, and it has been applied in fields such as medical image analysis, autonomous driving, and industrial anomaly detection. Because the richer modality is typically needed only during training, the distilled model can gain accuracy while running on the cheaper modality alone at inference time; the work also advances the understanding of how knowledge transfers across different data representations.
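As a rough illustration of the idea above, the following minimal sketch computes two common distillation objectives on paired embeddings: a feature-matching (mean squared error) loss and an InfoNCE-style contrastive loss. It is not taken from any specific paper. The names `student_feats` and `teacher_feats`, the function names, the temperature value, and the random toy data are all assumptions made for illustration; in practice the embeddings would come from encoders for the weaker and richer modality, respectively.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def feature_distillation_loss(student_feats, teacher_feats):
    """Mean squared error between paired student and teacher embeddings."""
    return float(np.mean((student_feats - teacher_feats) ** 2))


def contrastive_distillation_loss(student_feats, teacher_feats, temperature=0.1):
    """InfoNCE-style loss: student row i should match teacher row i
    and differ from every other teacher row in the batch."""
    s = l2_normalize(student_feats)
    t = l2_normalize(teacher_feats)
    logits = s @ t.T / temperature                       # (batch, batch) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))           # positives lie on the diagonal


# Toy data (hypothetical): 8 paired samples with 16-dimensional embeddings.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 16))                          # rich-modality embeddings
aligned_student = teacher + 0.01 * rng.normal(size=(8, 16)) # student close to teacher
random_student = rng.normal(size=(8, 16))                   # student unrelated to teacher

print("MSE, aligned:", feature_distillation_loss(aligned_student, teacher))
print("MSE, random: ", feature_distillation_loss(random_student, teacher))
print("InfoNCE, aligned:", contrastive_distillation_loss(aligned_student, teacher))
print("InfoNCE, random: ", contrastive_distillation_loss(random_student, teacher))
```

Both losses should come out lower for the aligned student than for the random one. In a real training loop the teacher embeddings would usually be held fixed and only the student encoder updated, often with a task loss added alongside the distillation term.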

Papers