Multi-Modal Colorization

Multi-modal colorization aims to automatically add realistic color to grayscale images, using input modalities such as text descriptions, user-drawn strokes, and example colors to guide the result. Recent research relies heavily on diffusion models and transformers, often incorporating pre-trained models such as Stable Diffusion and CLIP to improve color accuracy, diversity, and user control. The field matters for image quality and accessibility, with applications ranging from enhancing historical photographs to assisting in medical image analysis and interactive art creation.

Papers