Upmixing via style transfer: a variational autoencoder for disentangling spatial images and musical content [2203.12053]