Molecular Representation
Molecular representation research encodes the structural and chemical information in molecules into formats suitable for machine learning, with the aim of improving molecular property prediction. Current work emphasizes multimodal approaches that integrate data types such as molecular graphs, SMILES strings, and textual descriptions, often using graph neural networks (GNNs), transformers, and contrastive learning. More accurate and efficient property prediction in turn speeds the design of new molecules with desired characteristics, with applications in drug discovery, materials science, and environmental chemistry.
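As a rough illustration of what a graph representation of a molecule can look like, the sketch below writes ethanol (SMILES "CCO") out by hand as atoms and bonds, applies one round of sum-based neighbor aggregation, and pools the result into a single vector. It is a toy example under stated assumptions, not the method of any paper listed here: the element vocabulary, the one-hot features, and the helper names (one_hot, build_neighbors, message_passing_step, readout) are all hypothetical. Hydrogens are left implicit and there are no learned weights.

```python
# Toy sketch (assumption): ethanol, SMILES "CCO", written out by hand
# as a graph of heavy atoms. Hydrogens are left implicit.
ELEMENTS = ["C", "O"]          # hypothetical two-element vocabulary

atoms = ["C", "C", "O"]        # node labels, indexed 0..2
bonds = [(0, 1), (1, 2)]       # undirected edges between atom indices


def one_hot(symbol):
    """Encode an element symbol as a one-hot vector over ELEMENTS."""
    return [1.0 if symbol == e else 0.0 for e in ELEMENTS]


def build_neighbors(num_atoms, bonds):
    """Turn the bond list into an adjacency list."""
    neighbors = {i: [] for i in range(num_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


def message_passing_step(features, neighbors):
    """One unweighted update: each atom adds its neighbors' features to its own."""
    updated = []
    for i, h in enumerate(features):
        new_h = list(h)
        for j in neighbors[i]:
            new_h = [x + y for x, y in zip(new_h, features[j])]
        updated.append(new_h)
    return updated


def readout(features):
    """Sum atom vectors into one fixed-size molecule vector."""
    return [sum(column) for column in zip(*features)]


features = [one_hot(a) for a in atoms]
neighbors = build_neighbors(len(atoms), bonds)
hidden = message_passing_step(features, neighbors)
molecule_vector = readout(hidden)
print(molecule_vector)  # [5.0, 2.0]
```

A real GNN would replace the plain sums with learned transformations and would usually derive the atoms and bonds from a SMILES string with a cheminformatics toolkit instead of listing them by hand.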
Papers
BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations
Qizhi Pei, Wei Zhang, Jinhua Zhu, Kehan Wu, Kaiyuan Gao, Lijun Wu, Yingce Xia, Rui Yan
ProFSA: Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment
Bowen Gao, Yinjun Jia, Yuanle Mo, Yuyan Ni, Weiying Ma, Zhiming Ma, Yanyan Lan
ADMET property prediction through combinations of molecular fingerprints
James H. Notwell, Michael W. Wood
Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks
Yanqiao Zhu, Jeehyun Hwang, Keir Adams, Zhen Liu, Bozhao Nan, Brock Stenfors, Yuanqi Du, Jatin Chauhan, Olaf Wiest, Olexandr Isayev, Connor W. Coley, Yizhou Sun, Wei Wang