Visual Imagination

Visual imagination research explores how computational models can generate and manipulate visual representations, an ability modeled on human cognition. Current efforts integrate vision-language models with generative AI, using architectures such as diffusion transformers and variational autoencoders to create and edit images and 3D assets from textual or visual prompts, even when the input is ambiguous. The work aims to improve the understanding of visual cognition, with applications in 3D modeling, graphic design, and AI systems for tasks that require visual reasoning and creative problem-solving.

Papers