Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment [2406.07280]