T2I Diffusion Model
Text-to-image (T2I) diffusion models generate images from textual descriptions, building on advances in diffusion probabilistic models and transformer architectures. Current research aims to improve the trustworthiness, controllability, and efficiency of these models, addressing problems such as bias, safety, and compositional generation through techniques including adversarial training, prompt engineering, and multimodal conditioning (e.g., incorporating sketches or 3D layouts). This rapidly evolving field has significant implications for creative content generation, medical imaging, and other applications that require high-quality, customizable image synthesis, and it raises important ethical considerations regarding bias and misuse.
Papers
Editing Massive Concepts in Text-to-Image Diffusion Models
Tianwei Xiong, Yue Wu, Enze Xie, Yue Wu, Zhenguo Li, Xihui Liu
AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
Jingkun An, Yinghao Zhu, Zongjian Li, Enshen Zhou, Haoran Feng, Xijie Huang, Bohua Chen, Yemin Shi, Chengwei Pan