Chemical Structure Recognition
Chemical structure recognition (CSR) aims to automatically translate images of molecules into machine-readable formats such as SMILES strings, a capability that is important for accelerating drug discovery and chemical information processing. Current research relies heavily on deep learning, particularly convolutional neural networks (CNNs) and transformer architectures, which are often combined to capture both local and global image features. Much of this work focuses on improving robustness to diverse drawing styles and on handling complex cases such as Markush structures. Advances in CSR benefit cheminformatics by enabling efficient data extraction from large chemical literature archives and by facilitating automated analysis of chemical reactions and properties.
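To make the target format concrete, the following is a minimal sketch of how a SMILES string might be split into tokens before being used as the output sequence of an image-to-text model. It is an illustration only, not the method of any paper listed here. The name `tokenize_smiles` and the regular expression are assumptions chosen for this example; the pattern covers common SMILES syntax (bracket atoms, organic-subset and aromatic atoms, bonds, branches, ring closures) and is not a full SMILES parser.

```python
import re
from typing import List

# Hypothetical token pattern (an assumption for this sketch): bracket atoms,
# two-letter halogens, organic-subset and aromatic atoms, bond and branch
# symbols, and ring-closure labels.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens, rejecting unrecognized characters."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # If any character was skipped by the pattern, the tokens will not
    # reassemble into the input, so the string is treated as invalid here.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


if __name__ == "__main__":
    # Aspirin, written as a SMILES string.
    print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))
```

Token-level splitting of this kind only checks surface syntax. It does not verify that the string describes a chemically valid molecule, which would require a cheminformatics toolkit.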
Papers
RFL: Simplifying Chemical Structure Recognition with Ring-Free Language
Qikai Chang, Mingjun Chen, Changpeng Pi, Pengfei Hu, Zhenrong Zhang, Jiefeng Ma, Jun Du, Baocai Yin, Jinshui Hu
Intelligent System for Automated Molecular Patent Infringement Assessment
Yaorui Shi, Sihang Li, Taiyan Zhang, Xi Fang, Jiankun Wang, Zhiyuan Liu, Guojiang Zhao, Zhengdan Zhu, Zhifeng Gao, Renxin Zhong, Linfeng Zhang, Guolin Ke, Weinan E, Hengxing Cai, Xiang Wang