Paper ID: 2208.04112

A review on longitudinal data analysis with random forest in precision medicine

Jianchang Hu, Silke Szymczak

Precision medicine provides customized treatments to patients based on their characteristics and is a promising approach to improving treatment efficiency. Large-scale omics data are useful for patient characterization, but their measurements often change over time, leading to longitudinal data. Random forest is one of the state-of-the-art machine learning methods for building prediction models and can play a crucial role in precision medicine. In this paper, we review extensions of the standard random forest method for longitudinal data analysis. The extensions are categorized according to the data structures for which they are designed. We consider both univariate and multivariate responses and further categorize the repeated measurements according to whether the time effect is relevant. Information on available software implementations of the reviewed extensions is also given. We conclude with a discussion of the limitations of our review and some future research directions.

Submitted: Aug 8, 2022