Paper ID: 2406.00183

Predicting solvation free energies with an implicit solvent machine learning potential

Sebastien Röcken, Anton F. Burnet, Julija Zavadlav

Machine learning (ML) potentials are a powerful tool in molecular modeling, enabling ab initio accuracy for comparably small computational costs. Nevertheless, all-atom simulations employing best-performing graph neural network architectures are still too expensive for applications requiring extensive sampling, such as free energy computations. Implicit solvent models could provide the necessary speed-up due to reduced degrees of freedom and faster dynamics. Here, we introduce a Solvation Free Energy Path Reweighting (ReSolv) framework to parametrize an implicit solvent ML potential for small organic molecules that accurately predicts the hydration free energy, an essential parameter in drug design and pollutant modeling. With a combination of top-down (experimental hydration free energy data) and bottom-up (ab initio data of molecules in a vacuum) learning, ReSolv bypasses the need for intractable ab initio data of molecules in explicit bulk solvent and does not have to resort to less accurate data-generating models. On the FreeSolv dataset, ReSolv achieves a mean absolute error close to average experimental uncertainty, significantly outperforming standard explicit solvent force fields. The presented framework paves the way toward deep molecular models that are more accurate yet computationally cheaper than classical atomistic models.

Submitted: May 31, 2024