Reward Tree

Reward trees are interpretable models used to represent reward functions in reinforcement learning, particularly within the context of learning from human feedback. Current research focuses on developing efficient algorithms, such as Monte Carlo Tree Search and genetic programming, to construct and refine these tree structures, often integrating them with large language models for improved performance and scalability in diverse applications like biomedical knowledge retrieval and traffic signal control. The ability to learn and utilize interpretable reward functions offers significant advantages in ensuring agent alignment, facilitating debugging, and improving the explainability of complex decision-making systems.

Papers