Multimodal Manga Complement

Multimodal manga complement (M2C) research aims to automatically reconstruct missing or damaged information in manga, leveraging both visual and textual cues to restore a complete, coherent narrative. Current efforts focus on models that couple visual scene understanding with natural language processing, often employing large language models and fine-grained visual prompting to infer missing dialogue or reconstruct damaged panels. This line of work matters for the preservation and accessibility of manga, supporting both scholarly study of the medium and the reading experience itself.
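To make the prompting idea concrete, the sketch below shows one plausible shape for such a pipeline: per-panel visual descriptions are interleaved with dialogue into a fine-grained prompt, and a language model is asked to fill a masked line. Every name here (the `Panel` structure, `build_prompt`, the stubbed `dummy_llm`) is hypothetical, standing in for whatever captioner and LLM a real system would use; it is not the method of any specific paper.

```python
from dataclasses import dataclass, field

@dataclass
class Panel:
    """One manga panel: a visual description plus its dialogue lines.
    A missing line is represented by the placeholder "[MASK]"."""
    description: str                      # e.g. output of an image captioner
    dialogue: list = field(default_factory=list)

def build_prompt(panels, target_idx):
    """Assemble a fine-grained prompt: panel-level scene descriptions
    interleaved with dialogue, asking the model to fill the masked line
    in panel `target_idx`."""
    lines = ["Fill in the [MASK] dialogue so the story stays coherent.\n"]
    for i, p in enumerate(panels):
        lines.append(f"Panel {i + 1} (scene: {p.description})")
        for d in p.dialogue:
            lines.append(f'  dialogue: "{d}"')
    lines.append(f"\nMissing line is in panel {target_idx + 1}. Answer:")
    return "\n".join(lines)

def complete(panels, target_idx, llm):
    """Send the prompt to a text-generation callable `llm` and splice
    the returned line back into the target panel's dialogue."""
    filled = llm(build_prompt(panels, target_idx)).strip()
    panel = panels[target_idx]
    panel.dialogue = [filled if d == "[MASK]" else d for d in panel.dialogue]
    return panels

# Stand-in for a real LLM call (e.g. an API client); returns a fixed line.
def dummy_llm(prompt):
    return "I won't give up!"

panels = [
    Panel("hero facing a storm", ["This is hopeless..."]),
    Panel("hero clenching fist", ["[MASK]"]),
]
complete(panels, 1, dummy_llm)
print(panels[1].dialogue[0])  # -> I won't give up!
```

In a real system the scene descriptions would come from a vision model and `dummy_llm` would be replaced by an actual multimodal or text LLM call; the point is only how fine-grained visual context and surrounding dialogue are combined into one completion prompt.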

Papers