Learning Tri-modal Embeddings for Zero-Shot Soundscape Mapping [2309.10667]