Paper ID: 2401.12232

Machine Learning Modeling Of SiRNA Structure-Potency Relationship With Applications Against Sars-Cov-2 Spike Gene

Damilola Oshunyinka

The pharmaceutical Research and development (R&D) process is lengthy and costly, taking nearly a decade to bring a new drug to the market. However, advancements in biotechnology, computational methods, and machine learning algorithms have the potential to revolutionize drug discovery, speeding up the process and improving patient outcomes. The COVID-19 pandemic has further accelerated and deepened the recognition of the potential of these techniques, especially in the areas of drug repurposing and efficacy predictions. Meanwhile, non-small molecule therapeutic modalities such as cell therapies, monoclonal antibodies, and RNA interference (RNAi) technology have gained importance due to their ability to target specific disease pathways and/or patient populations. In the field of RNAi, many experiments have been carried out to design and select highly efficient siRNAs. However, the established patterns for efficient siRNAs are sometimes contradictory and unable to consistently determine the most potent siRNA molecules against a target mRNA. Thus, this paper focuses on developing machine learning models based on the cheminformatics representation of the nucleotide composition (i.e. AUTGC) of siRNA to predict their potency and aid the selection of the most efficient siRNAs for further development. The PLS (Partial Least Square) and SVR (Support Vector Regression) machine learning models built in this work outperformed previously published models. These models can help in predicting siRNA potency and aid in selecting the best siRNA molecules for experimental validation and further clinical development. The study has demonstrated the potential of AI/machine learning models to help expedite siRNA-based drug discovery including the discovery of potent siRNAs against SARS-CoV-2.

Submitted: Jan 18, 2024