Text-to-Image Diffusion Models
Text-to-image diffusion models generate images from textual descriptions, aiming for high visual fidelity and precise alignment between the prompt and the generated image. Current research focuses on improving controllability, addressing safety concerns (e.g., preventing the generation of inappropriate content), and enhancing personalization through techniques such as continual learning and latent-space manipulation. These advances are significant for applications including medical imaging, artistic creation, and data augmentation, while also raising important ethical questions about model safety and bias.
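As a concrete point of reference for the pipeline the papers below build on or modify, here is a minimal text-to-image sampling sketch. It assumes the Hugging Face diffusers library and the runwayml/stable-diffusion-v1-5 checkpoint purely for illustration; none of the listed papers prescribe this exact setup.

```python
# Minimal text-to-image sampling sketch (assumes: diffusers, torch, a CUDA GPU,
# and the runwayml/stable-diffusion-v1-5 checkpoint as an illustrative choice).
import torch
from diffusers import StableDiffusionPipeline

# Load a pretrained latent diffusion pipeline; fp16 keeps GPU memory modest.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# The prompt conditions the iterative denoising process;
# guidance_scale trades sample diversity for prompt alignment.
image = pipe(
    "an astronaut riding a horse, oil painting",
    num_inference_steps=50,
    guidance_scale=7.5,
).images[0]
image.save("astronaut.png")
```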
Papers
Readout Guidance: Learning Control from Diffusion Features
Grace Luo, Trevor Darrell, Oliver Wang, Dan B Goldman, Aleksander Holynski
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model
Lingmin Ran, Xiaodong Cun, Jia-Wei Liu, Rui Zhao, Song Zijie, Xintao Wang, Jussi Keppo, Mike Zheng Shou
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs
Gege Gao, Weiyang Liu, Anpei Chen, Andreas Geiger, Bernhard Schölkopf
ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation
Moayed Haji-Ali, Guha Balakrishnan, Vicente Ordonez
SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples
Phillip Howard, Avinash Madasu, Tiep Le, Gustavo Lujan Moreno, Anahita Bhiwandiwalla, Vasudev Lal
Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters
James Seale Smith, Yen-Chang Hsu, Zsolt Kira, Yilin Shen, Hongxia Jin
Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing
Hyelin Nam, Gihyun Kwon, Geon Yeong Park, Jong Chul Ye
HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models
Zhonghao Wang, Wei Wei, Yang Zhao, Zhisheng Xiao, Mark Hasegawa-Johnson, Humphrey Shi, Tingbo Hou
HiPA: Enabling One-Step Text-to-Image Diffusion Models via High-Frequency-Promoting Adaptation
Yifan Zhang, Bryan Hooi
Unsupervised Keypoints from Pretrained Diffusion Models
Eric Hedlin, Gopal Sharma, Shweta Mahajan, Xingzhe He, Hossam Isack, Abhishek Kar, Helge Rhodin, Andrea Tagliasacchi, Kwang Moo Yi
Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models
Daniel Geng, Inbum Park, Andrew Owens
Receler: Reliable Concept Erasing of Text-to-Image Diffusion Models via Lightweight Erasers
Chi-Pin Huang, Kai-Po Chang, Chung-Ting Tsai, Yung-Hsuan Lai, Fu-En Yang, Yu-Chiang Frank Wang
Fair Text-to-Image Diffusion via Fair Mapping
Jia Li, Lijie Hu, Jingfeng Zhang, Tianhang Zheng, Hua Zhang, Di Wang
AnyLens: A Generative Diffusion Model with Any Rendering Lens
Andrey Voynov, Amir Hertz, Moab Arar, Shlomi Fruchter, Daniel Cohen-Or
When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation
Xiaoming Li, Xinyu Hou, Chen Change Loy
Unlocking Spatial Comprehension in Text-to-Image Diffusion Models
Mohammad Mahdi Derakhshani, Menglin Xia, Harkirat Behl, Cees G. M. Snoek, Victor Rühle
LEDITS++: Limitless Image Editing using Text-to-Image Models
Manuel Brack, Felix Friedrich, Katharina Kornmeier, Linoy Tsaban, Patrick Schramowski, Kristian Kersting, Apolinário Passos