Paper ID: 2312.10707

CLDR: Contrastive Learning Drug Response Models from Natural Language Supervision

Kun Li, Wenbin Hu

Deep learning-based drug response prediction (DRP) methods can accelerate the drug discovery process and reduce R\&D costs. Although the mainstream methods achieve high accuracy in predicting response regression values, the regression-aware representations of these methods are fragmented and fail to capture the continuity of the sample order. This phenomenon leads to models optimized to sub-optimal solution spaces, reducing generalization ability and may result in significant wasted costs in the drug discovery phase. In this paper, we propose \MN, a contrastive learning framework with natural language supervision for the DRP. The \MN~converts regression labels into text, which is merged with the captions text of the drug response as a second modality of the samples compared to the traditional modalities (graph, sequence). In each batch, two modalities of one sample are considered positive pairs and the other pairs are considered negative pairs. At the same time, in order to enhance the continuous representation capability of the numerical text, a common-sense numerical knowledge graph is introduced. We validated several hundred thousand samples from the Genomics of Drug Sensitivity in Cancer dataset, observing the average improvement of the DRP method ranges from 7.8\% to 31.4\% with the application of our framework. The experiments prove that the \MN~effectively constrains the samples to a continuous distribution in the representation space, and achieves impressive prediction performance with only a few epochs of fine-tuning after pre-training. The code is available at: \url{https://gitee.com/xiaoyibang/clipdrug.git}.

Submitted: Dec 17, 2023