Text to Image Diffusion Model
Text-to-image diffusion models generate images from textual descriptions, aiming for high-fidelity and precise alignment. Current research focuses on improving controllability, addressing safety concerns (e.g., preventing generation of inappropriate content), and enhancing personalization capabilities through techniques like continual learning and latent space manipulation. These advancements are significant for various applications, including medical imaging, artistic creation, and data augmentation, while also raising important ethical considerations regarding model safety and bias.
Papers
DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery
Yixuan Zhu, Ao Li, Yansong Tang, Wenliang Zhao, Jie Zhou, Jiwen Lu
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
Kangfu Mei, Zhengzhong Tu, Mauricio Delbracio, Hossein Talebi, Vishal M. Patel, Peyman Milanfar
CosmicMan: A Text-to-Image Foundation Model for Humans
Shikai Li, Jianglin Fu, Kaiyuan Liu, Wentao Wang, Kwan-Yee Lin, Wayne Wu
Uncovering the Text Embedding in Text-to-Image Diffusion Models
Hu Yu, Hao Luo, Fan Wang, Feng Zhao
Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond
Katherine Xu, Lingzhi Zhang, Jianbo Shi
GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models
Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag
MIST: Mitigating Intersectional Bias with Disentangled Cross-Attention Editing in Text-to-Image Diffusion Models
Hidir Yesiltepe, Kiymet Akdemir, Pinar Yanardag
TextCraftor: Your Text Encoder Can be Image Quality Controller
Yanyu Li, Xian Liu, Anil Kag, Ju Hu, Yerlan Idelbayev, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov, Jian Ren
ECNet: Effective Controllable Text-to-Image Diffusion Models
Sicheng Li, Keqiang Sun, Zhixin Lai, Xiaoshi Wu, Feng Qiu, Haoran Xie, Kazunori Miyata, Hongsheng Li
Ship in Sight: Diffusion Models for Ship-Image Super Resolution
Luigi Sigillo, Riccardo Fosco Gramaccioni, Alessandro Nicolosi, Danilo Comminiello
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
Omer Dahary, Or Patashnik, Kfir Aberman, Daniel Cohen-Or
Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance
Jingyuan Zhu, Huimin Ma, Jiansheng Chen, Jian Yuan
FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models
Lin Zhao, Tianchen Zhao, Zinan Lin, Xuefei Ning, Guohao Dai, Huazhong Yang, Yu Wang
DP-RDM: Adapting Diffusion Models to Private Domains Without Fine-Tuning
Jonathan Lebensold, Maziar Sanjabi, Pietro Astolfi, Adriana Romero-Soriano, Kamalika Chaudhuri, Mike Rabbat, Chuan Guo
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
Pablo Marcos-Manchón, Roberto Alcover-Couso, Juan C. SanMiguel, Jose M. Martínez