Reward Gradient

Reward gradient methods improve generative models, particularly diffusion models, by optimizing them toward outcomes defined by a reward function. Current research focuses on backpropagating reward gradients efficiently through the model's generation process. It uses techniques such as low-rank adapters and gradient checkpointing to keep memory and compute costs manageable, and it compares gradient-based approaches with gradient-free alternatives such as evolutionary strategies. The work matters because it makes fine-tuning for specific tasks more effective, improving controllability and alignment with human preferences in applications including video generation, autonomous driving, and robotics.
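To make the contrast between backpropagated reward gradients and gradient-free estimates concrete, here is a minimal sketch on a toy scalar "generator". It is not taken from any particular paper: the update rule `x_{t+1} = A * x_t + theta`, the quadratic reward, and all names (`generate`, `reward_grad_backprop`, `reward_grad_es`, `fine_tune`) are assumptions chosen for illustration. A real diffusion model would rely on automatic differentiation, and possibly low-rank adapters and gradient checkpointing, none of which are shown here.

```python
import random

A = 0.9        # fixed per-step coefficient of the toy generator (assumption)
STEPS = 10     # number of generation steps
TARGET = 2.0   # final sample value that the toy reward prefers (assumption)


def generate(theta, z):
    """Run the toy chain x_{t+1} = A * x_t + theta, starting from noise z."""
    x = z
    for _ in range(STEPS):
        x = A * x + theta
    return x


def reward(x):
    """Toy differentiable reward, maximal when the final sample equals TARGET."""
    return -(x - TARGET) ** 2


def reward_grad_backprop(theta, z):
    """Reverse-mode gradient of reward(generate(theta, z)) with respect to theta."""
    x_final = generate(theta, z)
    adjoint = -2.0 * (x_final - TARGET)  # d reward / d x_T
    grad = 0.0
    for _ in range(STEPS):
        grad += adjoint   # theta enters each step additively: d x_{t+1} / d theta = 1
        adjoint *= A      # move one step earlier: d x_{t+1} / d x_t = A
    return grad


def reward_grad_es(theta, z, sigma=1e-3, pairs=64, rng=None):
    """Gradient-free estimate from antithetic Gaussian perturbations of theta."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(pairs):
        eps = rng.gauss(0.0, 1.0)
        r_plus = reward(generate(theta + sigma * eps, z))
        r_minus = reward(generate(theta - sigma * eps, z))
        total += (r_plus - r_minus) / (2.0 * sigma) * eps
    return total / pairs


def fine_tune(theta=0.0, lr=0.001, iters=500, seed=0):
    """Gradient ascent on the reward, drawing fresh noise at every iteration."""
    rng = random.Random(seed)
    for _ in range(iters):
        z = rng.gauss(0.0, 1.0)
        theta += lr * reward_grad_backprop(theta, z)
    return theta


if __name__ == "__main__":
    print("backprop gradient:", reward_grad_backprop(0.0, 0.5))
    print("ES estimate:      ", reward_grad_es(0.0, 0.5))
    print("fine-tuned theta: ", fine_tune())
```

In this sketch the backpropagated gradient costs one forward and one backward pass through the chain, while the evolutionary-strategy estimate needs two forward passes per perturbation pair and stays noisy. That is roughly the trade-off the gradient-based versus gradient-free comparison is about, though the balance in real models depends on memory limits and on whether the reward is differentiable at all.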

Papers