Paper ID: 2409.16327

GATher: Graph Attention Based Predictions of Gene-Disease Links

David Narganes-Carlon, Anniek Myatt, Mani Mudaliar, Daniel J. Crowther

Target selection is crucial in pharmaceutical drug discovery, directly influencing clinical trial success. Drug development nevertheless remains resource-intensive, often taking over a decade at significant financial cost, and high failure rates highlight the need for better early-stage target selection. We present GATher, a graph attention network designed to predict therapeutic gene-disease links by integrating data from diverse biomedical sources into a graph with over 4.4 million edges. GATher incorporates GATv3, a novel graph attention convolution layer, and GATv3HeteroConv, which aggregates transformations for each edge type, enhancing the model's ability to manage complex interactions within this extensive dataset. Using hard negative sampling and multi-task pre-training, GATher addresses topological imbalances and improves specificity. Trained on data up to 2018 and evaluated through 2024, GATher predicts clinical trial outcomes with a ROC AUC of 0.69 for unmet efficacy failures and 0.79 for positive efficacy. Feature attribution with Captum highlights key nodes and relationships, enhancing model interpretability. By 2024, GATher improved precision in prioritizing the top 200 clinical trial targets to 14.1%, an absolute increase of over 3.5 percentage points compared to other methods. GATher outperforms existing models such as GAT, GATv2, and HGT in predicting clinical trial outcomes, demonstrating its potential to enhance target validation and to predict clinical efficacy and safety.
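
The top-200 precision figure quoted in the abstract is a precision-at-k metric: rank all candidate targets by predicted score and take the fraction of the top k that are true positives. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `precision_at_k`, the toy scores, and the binary labels are assumptions made for illustration only.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Return the fraction of the k highest-scored candidates labelled 1.

    Hypothetical helper for illustration; labels are assumed to be 0 or 1.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if k <= 0 or not scores:
        raise ValueError("k must be positive and scores must be non-empty")

    # Indices of candidates sorted from highest to lowest predicted score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # If fewer than k candidates exist, use all of them.
    top = ranked[:k]
    return sum(labels[i] for i in top) / len(top)


# Toy data (invented): five candidate gene-disease pairs.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_labels = [1, 0, 1, 0, 1]
toy_precision = precision_at_k(toy_scores, toy_labels, k=3)
```

With the toy data, two of the three highest-scored candidates are positives, so the value is 2/3. Under this definition, a precision of 14.1% at k = 200 would correspond to roughly 28 true positives among the 200 top-ranked targets.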

Submitted: Sep 23, 2024