Text-to-Image Diffusion Model
Text-to-image diffusion models generate images from textual descriptions, aiming for high visual fidelity and precise alignment between the image and the prompt. Current research focuses on improving controllability, addressing safety concerns (e.g., preventing the generation of inappropriate content), and enhancing personalization through techniques such as continual learning and latent space manipulation. These advances matter for applications including medical imaging, artistic creation, and data augmentation, and they raise ethical questions about model safety and bias.
Papers
VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation
Zixuan Chen, Ruijie Su, Jiahao Zhu, Lingxiao Yang, Jian-Huang Lai, Xiaohua Xie
Six-CD: Benchmarking Concept Removals for Benign Text-to-image Diffusion Models
Jie Ren, Kangrui Chen, Yingqian Cui, Shenglai Zeng, Hui Liu, Yue Xing, Jiliang Tang, Lingjuan Lyu
Stylebreeder: Exploring and Democratizing Artistic Styles through Text-to-Image Models
Matthew Zheng, Enis Simsar, Hidir Yesiltepe, Federico Tombari, Joel Simon, Pinar Yanardag
A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models
Xincheng Shuai, Henghui Ding, Xingjun Ma, Rongcheng Tu, Yu-Gang Jiang, Dacheng Tao
Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation
Eyal Michaeli, Ohad Fried
Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps
Nikita Starodubcev, Mikhail Khoroshikh, Artem Babenko, Dmitry Baranchuk
Make It Count: Text-to-Image Generation with an Accurate Number of Objects
Lital Binyamin, Yoad Tewel, Hilit Segev, Eran Hirsch, Royi Rassin, Gal Chechik
OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control
Yuzhong Huang, Zhong Li, Zhang Chen, Zhiyuan Ren, Guosheng Lin, Fred Morstatter, Yi Xu
Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models
Ziyi Wu, Yulia Rubanova, Rishabh Kabra, Drew A. Hudson, Igor Gilitschenski, Yusuf Aytar, Sjoerd van Steenkiste, Kelsey R. Allen, Thomas Kipf
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
Yucheng Han, Rui Wang, Chi Zhang, Juntao Hu, Pei Cheng, Bin Fu, Hanwang Zhang