Paper ID: 2409.18768 • Published Sep 27, 2024
Learning from Demonstration with Implicit Nonlinear Dynamics Models
Peter David Fagan, Subramanian Ramamoorthy
Learning from Demonstration (LfD) is a useful paradigm for training policies
that solve tasks involving complex motions, such as those encountered in
robotic manipulation. In practice, the successful application of LfD requires
overcoming error accumulation during policy execution, i.e. the problem of
drift due to errors compounding over time and the consequent
out-of-distribution behaviours. Existing works seek to address this problem
by scaling data collection, correcting policy errors with a
human in the loop, temporally ensembling policy predictions, or learning
a dynamical system model with convergence guarantees. In this work, we propose
and validate an alternative approach to overcoming this issue. Inspired by
reservoir computing, we develop a recurrent neural network layer that includes
a fixed nonlinear dynamical system with tunable dynamical properties for
modelling temporal dynamics. We validate the efficacy of our neural network
layer on the task of reproducing human handwriting motions using the LASA Human
Handwriting Dataset. Through empirical experiments we demonstrate that
incorporating our layer into existing neural network architectures addresses
the issue of compounding errors in LfD. Furthermore, we perform a comparative
evaluation against existing approaches including a temporal ensemble of policy
predictions and an Echo State Network (ESN) implementation. We find that our
approach yields greater policy precision and robustness on the handwriting task
while also generalising to multiple dynamics regimes and maintaining
competitive latency scores.
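
To make the reservoir computing idea referred to above more concrete, the following is a minimal sketch of a generic leaky reservoir with fixed random weights. It is not the layer proposed in the paper. All names (`make_reservoir`, `step`, `run`) and the parameters `gain` and `leak` are hypothetical choices for illustration, and the row-sum rescaling is a conservative stand-in for the spectral-radius tuning commonly used with Echo State Networks. The sketch only illustrates one property such fixed dynamics can have: when the recurrent weights are contractive, two different initial states driven by the same input sequence converge, so an early state error fades rather than compounds.

```python
import math
import random


def make_reservoir(n_units, n_inputs, gain=0.9, seed=0):
    """Build fixed (untrained) random weights for a toy reservoir.

    Assumption for illustration: the recurrent matrix is rescaled so that its
    largest absolute row sum equals `gain`. With gain < 1 and a tanh
    nonlinearity the state update is a contraction in the max norm.
    """
    rng = random.Random(seed)
    w_rec = [[rng.uniform(-1.0, 1.0) for _ in range(n_units)] for _ in range(n_units)]
    w_in = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_units)]
    max_row_sum = max(sum(abs(w) for w in row) for row in w_rec)
    w_rec = [[gain * w / max_row_sum for w in row] for row in w_rec]
    return w_rec, w_in


def step(state, u, w_rec, w_in, leak=0.5):
    """One leaky update: x <- (1 - leak) * x + leak * tanh(W x + W_in u)."""
    new_state = []
    for i, x_i in enumerate(state):
        pre = sum(w * x for w, x in zip(w_rec[i], state))
        pre += sum(w * v for w, v in zip(w_in[i], u))
        new_state.append((1.0 - leak) * x_i + leak * math.tanh(pre))
    return new_state


def run(initial_state, inputs, w_rec, w_in, leak=0.5):
    """Drive the reservoir with an input sequence and return the final state."""
    state = list(initial_state)
    for u in inputs:
        state = step(state, u, w_rec, w_in, leak)
    return state


# Same input sequence, two different starting states.
w_rec, w_in = make_reservoir(n_units=20, n_inputs=2)
inputs = [[math.sin(0.1 * t), math.cos(0.1 * t)] for t in range(200)]
state_a = run([0.0] * 20, inputs, w_rec, w_in)
state_b = run([1.0] * 20, inputs, w_rec, w_in)
gap = max(abs(a - b) for a, b in zip(state_a, state_b))
print(f"max state gap after 200 steps: {gap:.2e}")
```

In this sketch `gain` and `leak` act as the tunable dynamical properties: the per-step contraction factor is at most `(1 - leak) + leak * gain`, so lowering either value makes the initial-state gap shrink faster. In a full reservoir-computing model only a readout on top of the state would be trained, and an array library would normally replace the nested lists.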