Image-Based Reinforcement Learning

Image-based reinforcement learning (RL) trains agents to make decisions directly from visual input, loosely analogous to how humans learn from experience. Current research focuses on improving data efficiency and generalization through three main approaches: learning disentangled representations, selectively attending to salient image features (often using Vision Transformers or convolutional networks), and pre-training models on auxiliary tasks to improve robustness to environmental variations. These advances are crucial for enabling RL agents to perform complex real-world tasks, particularly in robotics and autonomous systems, where direct visual perception is essential.

Papers