Paper ID: 2211.11225
TimbreCLIP: Connecting Timbre to Text and Images
Nicolas Jonason, Bob L. T. Sturm
We present work in progress on TimbreCLIP, an audio-text cross-modal embedding trained on single-instrument notes. We evaluate the models with a cross-modal retrieval task on synth patches. Finally, we demonstrate the application of TimbreCLIP on two tasks: text-driven audio equalization and timbre-to-image generation.
Submitted: Nov 21, 2022
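
The cross-modal retrieval evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes that text and audio embeddings live in a shared space, that pairs are aligned by row index, and that ranking uses cosine similarity, none of which the abstract specifies. The function names `l2_normalize`, `retrieve`, and `recall_at_k` are hypothetical, and the embeddings are toy placeholders rather than TimbreCLIP outputs.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def retrieve(text_emb, audio_emb, k=1):
    """For each text embedding, return indices of the k most similar audio embeddings."""
    sims = l2_normalize(text_emb) @ l2_normalize(audio_emb).T
    # Sort by descending similarity and keep the first k columns.
    return np.argsort(-sims, axis=1)[:, :k]


def recall_at_k(text_emb, audio_emb, k=1):
    """Fraction of text queries whose paired audio (same row index) is in the top k.

    Assumption: row i of text_emb describes row i of audio_emb.
    """
    top = retrieve(text_emb, audio_emb, k)
    targets = np.arange(text_emb.shape[0])[:, None]
    return float(np.mean(np.any(top == targets, axis=1)))


# Toy placeholder embeddings: four "synth patches" and four matching "descriptions".
audio_emb = np.eye(4)
text_emb = np.eye(4) + 0.1  # each text vector points mostly toward its paired audio

print(retrieve(text_emb, audio_emb, k=1)[:, 0])
print(recall_at_k(text_emb, audio_emb, k=1))
```

With real embeddings, `text_emb` and `audio_emb` would come from the respective encoders, and recall at several values of `k` would typically be reported together.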