Text-to-Image Diffusion Models
Text-to-image diffusion models generate images from textual descriptions, aiming for high visual fidelity and precise alignment between the prompt and the output. Current research focuses on improving controllability, addressing safety concerns (e.g., preventing the generation of inappropriate content), and enhancing personalization through techniques such as continual learning and latent-space manipulation. These advances matter for applications ranging from medical imaging to artistic creation and data augmentation, while also raising ethical questions about model safety and bias.
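To make the generation step concrete, the sketch below shows a minimal text-to-image call using the Hugging Face diffusers library. The checkpoint ID, prompt, and sampler settings are illustrative assumptions, not drawn from any of the papers listed here.

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a pretrained text-to-image latent diffusion pipeline.
    # The model ID is an assumption for illustration; any
    # diffusers-compatible checkpoint works the same way.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.float16,
    )
    pipe = pipe.to("cuda")

    # The text prompt conditions the iterative denoising process.
    prompt = "a watercolor painting of a lighthouse at dawn"
    image = pipe(
        prompt,
        num_inference_steps=50,  # number of denoising steps
        guidance_scale=7.5,      # classifier-free guidance strength
    ).images[0]

    image.save("lighthouse.png")

The guidance scale is one of the simplest controllability knobs in this line of work: higher values push the sample toward the prompt at the cost of diversity.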
Papers
MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation
Ansh Shah, K Madhava Krishna
Test-time Conditional Text-to-Image Synthesis Using Diffusion Models
Tripti Shukla, Srikrishna Karanam, Balaji Vasan Srinivasan
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer
Shitong Shao, Zikai Zhou, Tian Ye, Lichen Bai, Zhiqiang Xu, Zeke Xie
TDSM: Triplet Diffusion for Skeleton-Text Matching in Zero-Shot Action Recognition
Jeonghyeok Do, Munchurl Kim
MaskMedPaint: Masked Medical Image Inpainting with Diffusion Models for Mitigation of Spurious Correlations
Qixuan Jin, Walter Gerych, Marzyeh Ghassemi
Evaluating Text-to-Image Diffusion Models for Texturing Synthetic Data
Thomas Lips, Francis wyffels
EyeDiff: text-to-image diffusion model improves rare eye disease diagnosis
Ruoyu Chen, Weiyi Zhang, Bowen Liu, Xiaolan Chen, Pusheng Xu, Shunming Liu, Mingguang He, Danli Shi
Boundary Attention Constrained Zero-Shot Layout-To-Image Generation
Huancheng Chen, Jingtao Li, Weiming Zhuang, Haris Vikalo, Lingjuan Lyu