Paper ID: 2208.06436

RandomSCM: interpretable ensembles of sparse classifiers tailored for omics data

Thibaud Godon, Pier-Luc Plante, Baptiste Bauvin, Elina Francovic-Fontaine, Alexandre Drouin, Jacques Corbeil

Background: Understanding the relationship between omics data and the phenotype is a central problem in precision medicine. The high dimensionality of metabolomics data challenges learning algorithms in terms of scalability and generalization. Most learning algorithms do not produce interpretable models. -- Method: We propose an ensemble learning algorithm based on conjunctions or disjunctions of decision rules. -- Results: Applications on metabolomics data show that it produces models that achieve high predictive performance. The interpretability of the models makes them useful for biomarker discovery and pattern discovery in high-dimensional data.
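
To make the idea of an ensemble of rule conjunctions concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the greedy rule selection, the bootstrap resampling, the random feature subsets, the majority vote, and every function and parameter name below are assumptions chosen for illustration, since the abstract does not specify them. Each base model is a conjunction of threshold rules of the form "feature > t" or "feature <= t".

```python
import random
from collections import Counter

# A rule is a tuple (feature_index, threshold, greater):
#   greater=True  means "x[feature] >  threshold"
#   greater=False means "x[feature] <= threshold"

def rule_holds(rule, x):
    feature, threshold, greater = rule
    return (x[feature] > threshold) == greater

def predict_conjunction(rules, x):
    # Predict 1 only if every rule holds (an empty conjunction predicts 1).
    return int(all(rule_holds(r, x) for r in rules))

def fit_conjunction(X, y, features, max_rules=2):
    """Toy greedy learner (an assumption, not the paper's procedure):
    repeatedly add the rule that rejects the most remaining negatives
    while still holding for every positive example."""
    positives = [x for x, label in zip(X, y) if label == 1]
    negatives = [x for x, label in zip(X, y) if label == 0]
    rules = []
    while negatives and len(rules) < max_rules:
        best, best_rejected = None, 0
        for f in features:
            for t in sorted({x[f] for x in X}):
                for greater in (True, False):
                    rule = (f, t, greater)
                    if not all(rule_holds(rule, p) for p in positives):
                        continue
                    rejected = sum(not rule_holds(rule, n) for n in negatives)
                    if rejected > best_rejected:
                        best, best_rejected = rule, rejected
        if best is None:
            break
        rules.append(best)
        # Keep only the negatives the conjunction still accepts.
        negatives = [n for n in negatives if rule_holds(best, n)]
    return rules

def fit_ensemble(X, y, n_estimators=25, n_features=2, seed=0):
    # Hypothetical ensemble: bootstrap samples plus random feature subsets.
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        feats = rng.sample(range(d), min(n_features, d))
        ensemble.append(
            fit_conjunction([X[i] for i in idx], [y[i] for i in idx], feats)
        )
    return ensemble

def predict_ensemble(ensemble, x):
    # Strict majority vote over the base conjunctions.
    votes = sum(predict_conjunction(rules, x) for rules in ensemble)
    return int(2 * votes > len(ensemble))

def rule_feature_counts(ensemble):
    # How often each feature appears in a rule across the ensemble.
    return Counter(rule[0] for rules in ensemble for rule in rules)
```

Because every base model is a short list of threshold rules, counting how often each feature appears across the ensemble (as `rule_feature_counts` does) gives a simple ranking of candidate features, which is one plausible way such models could support biomarker discovery. A disjunction variant could be obtained by swapping the roles of the two classes and replacing `all` with `any`.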

Submitted: Aug 11, 2022