Story Visualization

Story visualization is the task of generating images or videos that coherently depict a textual narrative, with the goal of making stories easier to understand and more engaging. Current research relies heavily on large language models (LLMs) combined with diffusion models and transformers to generate image sequences that are visually consistent and relevant to the narrative context. Techniques such as spatial-temporal attention and character-centric modeling are often used to improve coherence across the sequence. The field matters for human-computer interaction, data storytelling, and educational and entertainment applications; ongoing work aims to improve efficiency, reduce computational cost, and make the generated visualizations more controllable.

Papers