Paper ID: 2408.11517

Imagining from Images with an AI Storytelling Tool

Edirlei Soares de Lima, Marco A. Casanova, Antonio L. Furtado

A method for generating narratives by analyzing single images or image sequences is presented, inspired by the time immemorial tradition of Narrative Art. The proposed method explores the multimodal capabilities of GPT-4o to interpret visual content and create engaging stories, which are illustrated by a Stable Diffusion XL model. The method is supported by a fully implemented tool, called ImageTeller, which accepts images from diverse sources as input. Users can guide the narrative's development according to the conventions of fundamental genres - such as Comedy, Romance, Tragedy, Satire or Mystery -, opt to generate data-driven stories, or to leave the prototype free to decide how to handle the narrative structure. User interaction is provided along the generation process, allowing the user to request alternative chapters or illustrations, and even reject and restart the story generation based on the same input. Additionally, users can attach captions to the input images, influencing the system's interpretation of the visual content. Examples of generated stories are provided, along with details on how to access the prototype.

Submitted: Aug 21, 2024