Diffusion Alignment

Diffusion alignment is the problem of steering the outputs of diffusion models, a class of generative models, toward desired properties or target data, typically through reinforcement learning or preference-based optimization. Current research emphasizes efficient alignment techniques, including Direct Preference Optimization (DPO) and guidance from pre-trained reward models or Q-functions, often within specific architectures such as generative flow networks or diffusion autoencoders. This work matters because it gives finer control over diffusion models and makes them easier to adapt to applications such as image and video generation, continuous control, and multimodal data integration, improving the quality and relevance of generated outputs.
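
To make the preference-based approach concrete, here is a minimal sketch of the generic DPO objective for a single preference pair. It assumes log-probabilities of the preferred and rejected samples are available under both the model being tuned and a frozen reference model. Diffusion-specific variants typically replace these log-probabilities with a surrogate derived from the denoising objective, which this sketch does not model. The function name `dpo_loss` and its parameters are illustrative, not taken from any particular library.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO loss for one preference pair (illustrative helper).

    logp_w / logp_l: log-probability of the preferred / rejected sample
    under the model being aligned.
    ref_logp_w / ref_logp_l: the same quantities under a frozen reference model.
    beta: strength of the implicit constraint toward the reference model
    (assumed default, not prescribed by the source).
    """
    # How much more the model favors the preferred sample over the
    # rejected one, relative to the reference model.
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    z = beta * margin
    # -log(sigmoid(z)), written as softplus(-z) for numerical stability.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Usage: the loss is lower when the model favors the preferred sample.
favors_preferred = dpo_loss(-1.0, -2.0, -1.5, -1.5)
favors_rejected = dpo_loss(-2.0, -1.0, -1.5, -1.5)
```

When the model and the reference agree exactly, the margin is zero and the loss equals log 2. Training pushes the margin positive, which lowers the loss.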

Papers