Molecular Embeddings
Molecular embeddings represent chemical structures and properties as numerical vectors, so that machine learning models can analyze molecules and predict their characteristics. Current research builds these embeddings with graph neural networks, large language models (LLMs), and multimodal approaches that combine chemical language with physicochemical features, with the aim of improving efficiency and accuracy in tasks such as similarity searching, property prediction, and drug discovery. These advances affect drug discovery and materials science by speeding up virtual screening and making large chemical databases more efficient to analyze. Developing more informative and computationally efficient embeddings remains an active area of investigation.
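To make the idea of similarity searching over embeddings concrete, here is a minimal sketch that ranks a small library of molecules by cosine similarity to a query vector. The function names, the molecule labels (`mol_a`, `mol_b`, `mol_c`), and the three-dimensional vectors are hypothetical placeholders chosen for illustration. Real embeddings would come from a trained model and typically have hundreds of dimensions, and cosine similarity is only one of several possible similarity measures.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, library):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings, not produced by any real model.
library = {
    "mol_a": [0.9, 0.1, 0.0],
    "mol_b": [0.1, 0.8, 0.3],
    "mol_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these toy vectors, `mol_a` and `mol_c` rank above `mol_b` because they point in nearly the same direction as the query. In a virtual screening setting the same ranking step would run over a much larger library, usually with an approximate nearest-neighbor index rather than a full scan.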
Papers
Beyond Chemical Language: A Multimodal Approach to Enhance Molecular Property Prediction
Eduardo Soares, Emilio Vital Brazil, Karen Fiorela Aquino Gutierrez, Renato Cerqueira, Dan Sanders, Kristin Schmidt, Dmitry Zubarev
Otter-Knowledge: benchmarks of multimodal knowledge graph representation learning from different sources for drug discovery
Hoang Thanh Lam, Marco Luca Sbodio, Marcos Martínez Galindo, Mykhaylo Zayats, Raúl Fernández-Díaz, Víctor Valls, Gabriele Picco, Cesar Berrospi Ramis, Vanessa López