Paper ID: 2204.06076
Hybrid Feature- and Similarity-Based Models for Joint Prediction and Interpretation
Jacqueline K. Kueper, Jennifer Rayner, Daniel J. Lizotte
Electronic health records (EHRs) include simple features like patient age together with more complex data like care history that are informative but not easily represented as individual features. To better harness such data, we developed an interpretable hybrid feature- and similarity-based model for supervised learning that combines feature and kernel learning for prediction and for investigation of causal relationships. We fit our hybrid models by convex optimization with a sparsity-inducing penalty on the kernel. Depending on the desired model interpretation, the feature and kernel coefficients can be learned sequentially or simultaneously. The hybrid models showed comparable or better predictive performance than solely feature- or similarity-based approaches in a simulation study and in a case study to predict two-year risk of loneliness or social isolation with EHR data from a complex primary health care population. Using the case study we also present new kernels for high-dimensional indicator-coded EHR data that are based on deviations from population-level expectations, and we identify considerations for causal interpretations.
Submitted: Apr 12, 2022