Change Captioning

Change captioning, the task of automatically generating textual descriptions of changes depicted in image pairs (e.g., remote sensing images over time), aims to improve the understanding and analysis of visual differences. Current research focuses on enhancing the accuracy and robustness of these descriptions, often employing multimodal architectures that integrate large language models with convolutional neural networks or transformers, and incorporating techniques like attention mechanisms and multi-scale feature extraction to better capture relevant details. This field is significant for applications in remote sensing, medical imaging, and assistive technologies for the visually impaired, offering automated analysis of change and improved accessibility to visual information.

Papers