Paper ID: 2409.13743

Effect of Clinical History on Predictive Model Performance for Renal Complications of Diabetes

Davide Dei Cas, Barbara Di Camillo, Gian Paolo Fadini, Giovanni Sparacino, Enrico Longato

Diabetes is a chronic disease characterised by a high risk of developing diabetic nephropathy, which, in turn, is the leading cause of end-stage chronic kidney disease. Early identification of individuals at heightened risk of such complications, or of their exacerbation, can be of paramount importance for setting an appropriate course of treatment. In the present work, using data collected in the DARWIN-Renal (DApagliflozin Real-World evIdeNce-Renal) study, a nationwide multicentre retrospective real-world study, we develop an array of logistic regression models to predict, over different prediction horizons, the crossing of clinically relevant estimated glomerular filtration rate (eGFR) thresholds in patients with diabetes, using demographic, anthropometric, laboratory, pathology, and therapeutic variables. In doing so, we investigate the impact of information from patients' past visits on the models' predictive performance, together with an analysis of feature importance using the Boruta algorithm. Our models yield very good performance (AUROC as high as 0.98). We also show that introducing information from patients' past visits improves model performance by up to 4%. The usefulness of past information is further corroborated by the feature importance analysis.
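As a rough illustration of the idea in the abstract, the sketch below shows one way information from past visits could be turned into extra features, and how AUROC compares a score that uses them against one that does not. Everything in it is an assumption made for illustration: the function names (`auroc`, `history_features`), the choice of eGFR differences as history features, the zero-padding of missing history, and the four-patient toy cohort are hypothetical and are not taken from the DARWIN-Renal study. A hand-written risk score stands in for a fitted logistic regression, so the resulting numbers say nothing about the paper's results.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def history_features(egfr_by_visit: Sequence[float], n_past: int) -> List[float]:
    """Return [current eGFR, change vs. previous visit, change vs. the visit
    before that, ...] using up to n_past earlier visits.

    Missing history is padded with 0.0 (a hypothetical choice)."""
    current = egfr_by_visit[-1]
    past = egfr_by_visit[-1 - n_past:-1]          # up to n_past earlier visits
    deltas = [current - p for p in reversed(past)]  # most recent visit first
    deltas += [0.0] * (n_past - len(deltas))
    return [current] + deltas


# Synthetic toy cohort: (eGFR at successive visits, 1 if a threshold is
# later crossed else 0). These values are invented for illustration only.
TOY_COHORT = [
    ([70, 66, 61], 1),  # steadily declining
    ([62, 61, 61], 0),  # stable
    ([80, 72, 65], 1),  # steadily declining
    ([64, 65, 64], 0),  # stable
]
labels = [y for _, y in TOY_COHORT]

# Stand-in risk score without history: lower current eGFR means higher risk.
score_current_only = [-history_features(v, 0)[0] for v, _ in TOY_COHORT]

# Stand-in risk score with history: a past decline (negative change) adds risk.
score_with_history = [
    -(f[0] + sum(f[1:]))
    for f in (history_features(v, 2) for v, _ in TOY_COHORT)
]

print("AUROC, current visit only:", auroc(labels, score_current_only))
print("AUROC, with two past visits:", auroc(labels, score_with_history))
```

In a real analysis the hand-written scores would be replaced by predicted probabilities from a model fitted on training data and evaluated on held-out patients; the feature construction and the AUROC comparison would keep the same shape.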

Submitted: Sep 10, 2024