Entropy Reward

Entropy reward methods add an entropy term to the training objective, either as a reward signal or as a regularizer, to improve the performance and robustness of reinforcement learning (RL) and generative models. Current research addresses challenges such as "reward collapse" when fine-tuning diffusion models, and promotes predictable, interpretable behavior in RL agents through entropy rate minimization, often building on techniques such as soft actor-critic (SAC) and population-based training. These approaches matter because they enhance exploration, mitigate overfitting to imperfect reward functions, and yield more diverse and reliable model outputs, with applications ranging from image generation to human-AI collaboration.
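
As a rough illustration of entropy used as a reward signal, the sketch below adds an entropy bonus to a task reward for a discrete softmax policy, in the spirit of maximum-entropy RL. It is a minimal sketch under stated assumptions, not the method of any paper listed here: the function names and the weight `alpha` are hypothetical and chosen only for this example.

```python
import math

def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; zero-probability actions contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_augmented_reward(reward, logits, alpha=0.1):
    """Task reward plus alpha times the policy entropy at the current state.

    alpha is a hypothetical trade-off weight: a larger value rewards
    more spread-out (exploratory) action distributions.
    """
    return reward + alpha * entropy(softmax(logits))

# A uniform policy earns a larger bonus than a sharply peaked one.
uniform_bonus = entropy_augmented_reward(1.0, [0.0, 0.0, 0.0, 0.0], alpha=0.5)
peaked_bonus = entropy_augmented_reward(1.0, [10.0, 0.0, 0.0, 0.0], alpha=0.5)
```

With the same task reward, the uniform policy receives the higher augmented reward, which is the mechanism by which an entropy term encourages exploration and discourages collapsing onto a single output.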

5 papers

Papers

March 17, 2025

High-entropy Advantage in Neural Networks' Generalizability
Landau Equation Entropy Reward Neural Network Stronger Generalizability Entropy Solution

December 25, 2024

Generative Models with ELBOs Converging to Entropy Sums
Unsupervised Probabilistic Model Entropy Reward Generative Model Evidence Lower Bound Variational Distribution

February 23, 2024

Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control
Diffusion Model Reward Collapse Policy Entropy Fine Tuning Entropy Reward

November 30, 2023

Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization
Reinforcement Learning Path Entropy Entropy Reward

September 7, 2022

On the Convergence of the ELBO to Entropy Sums
Variational Method Entropy Reward Stable Entropy Variational Distribution Early Stage Convergence Generative Model

January 28, 2022

Do You Need the Entropy Reward (in Practice)?
Entropy Regularization Maximum Entropy Entropy Reward Practice Mode Intrinsic Reward

December 22, 2021

Maximum Entropy Population-Based Training for Zero-Shot Human-AI Coordination
Maximum Entropy Self Play Reinforcement Learning Human AI Coordination Entropy Reward Population Based Training Zero Shot