Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble [2206.00238]