Learning mirror maps in policy mirror descent [2402.05187]