Paper ID: 2309.15039

Can-SAVE: Mass Cancer Risk Prediction via Survival Analysis Variables and EHR

Petr Philonenko, Vladimir Kokh, Pavel Blinov

Specific medical cancer screening methods are often costly, time-consuming, and weakly applicable on a large scale. Advanced Artificial Intelligence (AI) methods greatly help cancer detection but require specific or deep medical data. These aspects prevent the mass implementation of cancer screening methods. For this reason, it is a disruptive change for healthcare to apply AI methods for mass personalized assessment of the cancer risk among patients based on the existing Electronic Health Records (EHR) volume. This paper presents a novel Can-SAVE cancer risk assessment method combining a survival analysis approach with a gradient-boosting algorithm. It is highly accessible and resource-efficient, utilizing only a sequence of high-level medical events. We tested the proposed method in a long-term retrospective experiment covering more than 1.1 million people and four regions of Russia. The Can-SAVE method significantly exceeds the baselines by the Average Precision metric of 22.8%$\pm$2.7% vs 15.1%$\pm$2.6%. The extensive ablation study also confirmed the proposed method's dominant performance. The experiment supervised by oncologists shows a reliable cancer patient detection rate of up to 84 out of 1000 selected. Such results surpass the medical screening strategies estimates; the typical age-specific Number Needed to Screen is only 9 out of 1000 (for colorectal cancer). Overall, our experiments show a 4.7-6.4 times improvement in cancer detection rate (TOP@1k) compared to the traditional healthcare risk estimation approach.

Submitted: Sep 26, 2023