Paper ID: 2404.04857

Learning Adaptive Multi-Objective Robot Navigation Incorporating Demonstrations

Jorge de Heuvel, Tharun Sethuraman, Maren Bennewitz

Preference-aligned robot navigation in human environments is typically achieved through learning-based approaches that use user feedback or demonstrations for personalization. However, personal preferences are subject to change and may even be context-dependent. Traditional reinforcement learning (RL) approaches with static reward functions often fall short in adapting to these varying user preferences, and once training is completed they inevitably keep reflecting the demonstrations they were given. This paper introduces a framework that combines multi-objective reinforcement learning (MORL) with demonstration-based learning. Our approach allows for dynamic adaptation to changing user preferences without retraining. It smoothly modulates between reward-defined preference objectives and the degree to which demonstration data is reflected. Through rigorous evaluations, including a sim-to-real transfer on two robots, we demonstrate our framework's capability to reflect user preferences accurately while achieving high navigational performance in terms of collision avoidance and goal pursuit.

Submitted: Apr 7, 2024