Paper ID: 2208.09944

MolGraph: a Python package for the implementation of molecular graphs and graph neural networks with TensorFlow and Keras

Alexander Kensert, Gert Desmet, Deirdre Cabooter

Molecular machine learning (ML) has proven important for tackling various molecular problems, such as predicting molecular properties from molecular descriptors or fingerprints. More recently, graph neural network (GNN) algorithms have been applied to molecular ML, showing comparable or superior performance to descriptor- or fingerprint-based approaches. Although various tools and packages exist for applying GNNs in molecular ML, a new GNN package, named MolGraph, was developed in this work with the aim of creating GNN model pipelines that are highly compatible with the TensorFlow and Keras application programming interface (API). MolGraph also implements a chemistry module for generating small molecular graphs, which can be passed to a GNN algorithm to solve a molecular ML problem. To validate the GNNs, they were benchmarked on the MoleculeNet datasets, as well as on three chromatographic retention time datasets. The results on these benchmarks illustrate that the GNNs performed as expected. Additionally, the GNNs proved useful for molecular identification and improved interpretability of chromatographic retention time data. MolGraph is available at https://github.com/akensert/molgraph. Installation, tutorials and implementation details can be found at https://molgraph.readthedocs.io/en/latest/.
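
To make the idea of a molecular graph passed to a GNN concrete, here is a minimal pure-Python sketch. It does not use MolGraph's API: the function names, the dictionary layout, and the two-element atom vocabulary are all hypothetical and chosen only for illustration. A molecule is encoded as one-hot node features plus a directed edge list, then updated with one weight-free, sum-based message-passing step and a sum readout. A real GNN layer would add learned weights and nonlinearities.

```python
# Toy illustration only; names and data layout are assumptions, not MolGraph's API.
ATOM_TYPES = ["C", "O"]  # hypothetical, deliberately tiny atom vocabulary


def one_hot(symbol):
    """Encode an element symbol as a one-hot feature vector."""
    return [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]


def build_graph(atoms, bonds):
    """Build a graph from atom symbols and undirected bonds (index pairs)."""
    edge_src, edge_dst = [], []
    for i, j in bonds:
        # Store each undirected bond as two directed edges.
        edge_src += [i, j]
        edge_dst += [j, i]
    return {
        "node_feature": [one_hot(a) for a in atoms],
        "edge_src": edge_src,
        "edge_dst": edge_dst,
    }


def message_passing_step(graph):
    """Each node sums its own features with those of its neighbors."""
    feats = graph["node_feature"]
    updated = [list(f) for f in feats]  # start from the node's own state
    for src, dst in zip(graph["edge_src"], graph["edge_dst"]):
        for k, value in enumerate(feats[src]):
            updated[dst][k] += value
    return updated


def readout(node_states):
    """Sum node states into a single molecule-level vector."""
    return [sum(column) for column in zip(*node_states)]


# Ethanol, heavy atoms only: C-C-O
ethanol = build_graph(["C", "C", "O"], [(0, 1), (1, 2)])
states = message_passing_step(ethanol)
embedding = readout(states)
```

In a full pipeline, a molecule-level vector such as `embedding` would be fed to a prediction head, for example to regress a retention time.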

Submitted: Aug 21, 2022