Text-to-Image Diffusion
Text-to-image diffusion models generate images from textual descriptions by iteratively refining random noise into a coherent image, building on large pretrained models and the diffusion process. Current research focuses on improving controllability (e.g., viewpoint, object attributes, style) and efficiency (e.g., one-step generation, reduced parameter counts), and on addressing overfitting and bias in personalized models. Common techniques include parameter-efficient fine-tuning, reinforcement learning, and multi-modal conditioning on additional data sources (e.g., audio, depth maps). These advances enable more efficient and controllable image synthesis, with applications in computer graphics, digital art, and content creation.
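To make the "iteratively refining noise" idea concrete, below is a minimal, self-contained sketch of a DDPM-style reverse (denoising) loop. It is illustrative only: the untrained ToyEpsNet, the linear noise schedule, and the placeholder text embedding are all assumptions standing in for a real text-conditioned U-Net and text encoder, not any particular model's API.

```python
# Toy sketch of reverse diffusion: start from Gaussian noise and repeatedly
# subtract predicted noise until an "image" remains. Illustrative only.
import torch
import torch.nn as nn

T = 50                                   # number of diffusion steps (toy value)
betas = torch.linspace(1e-4, 0.02, T)    # linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class ToyEpsNet(nn.Module):
    """Stand-in for a text-conditioned U-Net that predicts the added noise."""
    def __init__(self, channels=3, emb_dim=16):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.cond = nn.Linear(emb_dim, channels)

    def forward(self, x, t, text_emb):
        # Inject the (placeholder) text embedding as a per-channel bias.
        bias = self.cond(text_emb).view(1, -1, 1, 1)
        return self.conv(x) + bias

@torch.no_grad()
def sample(model, text_emb, shape=(1, 3, 32, 32)):
    x = torch.randn(shape)               # start from pure Gaussian noise
    for t in reversed(range(T)):
        eps = model(x, t, text_emb)      # predict the noise present at step t
        a_t, ab_t = alphas[t], alpha_bars[t]
        # DDPM posterior mean: remove the predicted noise component.
        x = (x - (1 - a_t) / torch.sqrt(1 - ab_t) * eps) / torch.sqrt(a_t)
        if t > 0:                        # add fresh noise except at the final step
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x                             # result of the iterative refinement

model = ToyEpsNet()
prompt_embedding = torch.randn(16)       # placeholder for a real text-encoder output
img = sample(model, prompt_embedding)
print(img.shape)                         # torch.Size([1, 3, 32, 32])
```

In practice, the noise predictor is a large pretrained U-Net or transformer conditioned on text embeddings (e.g., from CLIP or T5), and the directions surveyed above modify this loop: one-step generation collapses the T iterations, while parameter-efficient fine-tuning adapts only a small subset of the predictor's weights for personalization.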
Papers
UVMap-ID: A Controllable and Personalized UV Map Generative Model
Weijie Wang, Jichao Zhang, Chang Liu, Xia Li, Xingqian Xu, Humphrey Shi, Nicu Sebe, Bruno Lepri
Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting
Weili Zeng, Yichao Yan, Qi Zhu, Zhuo Chen, Pengzhi Chu, Weiming Zhao, Xiaokang Yang