Paper ID: 2203.11449

How well does CLIP understand texture?

Chenyun Wu, Subhransu Maji

We investigate how well CLIP understands texture in natural images described by natural language. To this end, we analyze CLIP's ability to: (1) perform zero-shot learning on various texture and material classification datasets; (2) represent compositional properties of texture such as red dots or yellow stripes on the Describable Texture in Detail(DTDD) dataset; and (3) aid fine-grained categorization of birds in photographs described by color and texture of their body parts.

Submitted: Mar 22, 2022