Text-Driven Generation
Text-driven generation focuses on creating various outputs, such as images, 3D models, and even text itself, based solely on textual descriptions. Current research heavily utilizes diffusion models and large vision-language models like CLIP, often incorporating multi-modal guidance (e.g., images, 3D shapes) to enhance control and realism in the generated content. This field is significant for its potential to automate complex creative processes, enabling zero-shot generation of diverse outputs and facilitating new applications in areas like digital art, animation, and material science.
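To make the idea concrete, the sketch below shows how a textual prompt can drive a pretrained latent diffusion model to produce an image, using the Hugging Face diffusers library. The checkpoint name, prompt, and sampler settings are illustrative assumptions rather than details drawn from any particular paper in this area.

```python
# Minimal sketch of text-driven image generation with a pretrained
# latent diffusion model (Hugging Face `diffusers`). The checkpoint
# and prompt below are illustrative choices, not from a specific paper.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # assumed publicly available checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # move the model to GPU for practical inference speed

prompt = "a low-poly 3D render of a red fox, studio lighting"
image = pipe(
    prompt,
    num_inference_steps=30,   # number of denoising steps
    guidance_scale=7.5,       # classifier-free guidance strength toward the text
).images[0]
image.save("fox.png")
```

The same prompt-conditioning pattern extends to other modalities (for example, optimizing a 3D representation against a text prompt via CLIP or diffusion guidance), which is the common thread across much of the work in this area.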