Paper ID: 2405.15882

Risk Factor Identification In Osteoporosis Using Unsupervised Machine Learning Techniques

Mikayla Calitis

In this study, the reliability of identified risk factors associated with osteoporosis is investigated using a new clustering-based method on electronic medical records. This study proposes utilizing a new CLustering Iterations Framework (CLIF) that includes an iterative clustering framework that can adapt any of the following three components: clustering, feature selection, and principal feature identification. The study proposes using Wasserstein distance to identify principal features, borrowing concepts from the optimal transport theory. The study also suggests using a combination of ANOVA and ablation tests to select influential features from a data set. Some risk factors presented in existing works are endorsed by our identified significant clusters, while the reliability of some other risk factors is weakened.

Submitted: May 24, 2024