Human-in-the-Loop Reinforcement Learning
Human-in-the-loop reinforcement learning (HiRL) aims to improve the efficiency and effectiveness of reinforcement learning (RL) agents by incorporating human expertise and feedback during training. Current research focuses on reducing human workload through preference-based learning and techniques that leverage sub-optimal data or pre-training; these methods often build on algorithms such as Proximal Policy Optimization (PPO) and incorporate human input via reward shaping, action injection, or interactive learning. HiRL is significant because it addresses core challenges of traditional RL, such as reward function design and data scarcity, enabling more robust and adaptable agents for complex real-world applications in fields as diverse as autonomous vehicles, robotics, and particle physics.
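To make the preference-based idea concrete, the sketch below fits a reward model from pairwise human preferences using a Bradley-Terry formulation, where P(a preferred over b) = sigmoid(r(a) - r(b)). This is a minimal illustration, not any specific paper's method: the human is simulated by a scripted rule (`simulate_preference`), the reward model is linear, and names such as `true_w` and `fit_reward_model` are hypothetical placeholders introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "ground truth" preference direction used only to
# simulate human labels; a real system would query a person instead.
true_w = np.array([1.0, -2.0])

def simulate_preference(seg_a, seg_b):
    """Simulated human label: 1.0 if segment a is preferred, else 0.0."""
    return 1.0 if seg_a @ true_w > seg_b @ true_w else 0.0

def fit_reward_model(pairs, labels, lr=0.1, steps=500):
    """Fit a linear reward r(x) = w . x under the Bradley-Terry model,
    P(a > b) = sigmoid(r(a) - r(b)), by gradient ascent on log-likelihood."""
    w = np.zeros(2)
    for _ in range(steps):
        for (a, b), y in zip(pairs, labels):
            p = 1.0 / (1.0 + np.exp(-((a - b) @ w)))
            w += lr * (y - p) * (a - b)  # logistic-regression gradient step
    return w

# Trajectory segments are represented here as 2-d feature vectors.
segs = rng.normal(size=(40, 2))
pairs = [(segs[i], segs[i + 1]) for i in range(0, 40, 2)]
labels = [simulate_preference(a, b) for a, b in pairs]
w = fit_reward_model(pairs, labels)

# The learned reward should rank segments like the simulated human.
agree = np.mean([(a @ w > b @ w) == bool(y) for (a, b), y in zip(pairs, labels)])
```

In a full HiRL pipeline, the learned reward `w` would then replace or shape the environment reward that a policy-gradient algorithm such as PPO optimizes, so the agent's behavior tracks human preferences rather than a hand-designed reward function.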