Latent Diffusion Model
Latent diffusion models (LDMs) are generative AI models that create high-quality images by reversing a diffusion process in a compressed latent space, offering efficiency advantages over pixel-space methods. Current research focuses on improving controllability (e.g., through text or other modalities), enhancing efficiency (e.g., via parameter-efficient architectures or faster inference), and addressing challenges like model robustness and ethical concerns (e.g., watermarking and mitigating adversarial attacks). LDMs are significantly impacting various fields, including medical imaging (synthesis and restoration), speech enhancement, and even physics simulation, by enabling the generation of realistic and diverse data for training and analysis where real data is scarce or difficult to obtain.
Papers
Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion Models
Eleonora Lopez, Luigi Sigillo, Federica Colonnese, Massimo Panella, Danilo Comminiello
High-Resolution Speech Restoration with Latent Diffusion Model
Tushar Dhyani, Florian Lux, Michele Mancusi, Giorgio Fabbro, Fritz Hohl, Ngoc Thang Vu
Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending
Yongyang Pan, Xiaohong Liu, Siqi Luo, Yi Xin, Xiao Guo, Xiaoming Liu, Xiongkuo Min, Guangtao Zhai