Paper ID: 2112.02810

An Effective GCN-based Hierarchical Multi-label Classification for Protein Function Prediction

Kyudam Choi, Yurim Lee, Cheongwon Kim, Minsung Yoon

We propose an effective method for improving Protein Function Prediction (PFP) that exploits the hierarchical structure of Gene Ontology (GO) terms. The method combines a language model that encodes the protein sequence with a Graph Convolutional Network (GCN) that represents GO terms. To carry the hierarchical structure of GO into the GCN, we use node-wise (per GO term) representations that contain the whole hierarchical information. Compared with previous models, our algorithm remains effective on a large-scale graph obtained by expanding the GO graph. Experimental results show that our method outperforms state-of-the-art PFP approaches.
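
As a rough illustration of the two-branch idea described in the abstract, the sketch below propagates GO term features through a single graph convolution over a toy GO fragment and scores a protein vector against the resulting term embeddings. It is not the authors' implementation: the function names, the four-term toy graph, the random features, the undirected treatment of is-a edges, and the dot-product-plus-sigmoid scoring are all assumptions made for this example, and the random `protein_vec` merely stands in for the output of a protein language model. The layer uses the standard symmetric normalization D^-1/2 (A + I) D^-1/2.

```python
import numpy as np


def normalized_adjacency(edges, num_nodes):
    """Return D^-1/2 (A + I) D^-1/2 for a graph given as (child, parent) pairs."""
    a = np.eye(num_nodes)  # identity adds the self-loops
    for child, parent in edges:
        # Assumption: is-a edges are treated as undirected for propagation.
        a[child, parent] = 1.0
        a[parent, child] = 1.0
    deg = a.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    return np.maximum(a_hat @ h @ w, 0.0)


def predict(protein_vec, term_emb):
    """Score each GO term by a dot product with the protein vector, then sigmoid."""
    logits = term_emb @ protein_vec
    return 1.0 / (1.0 + np.exp(-logits))


rng = np.random.default_rng(0)

# Hypothetical GO fragment: term 0 is the root, 1 and 2 are its children,
# and 3 is a child of 1.
edges = [(1, 0), (2, 0), (3, 1)]
num_terms, dim = 4, 8

a_hat = normalized_adjacency(edges, num_terms)
term_features = rng.normal(size=(num_terms, dim))  # placeholder term features
weight = rng.normal(size=(dim, dim))               # untrained layer weights
term_emb = gcn_layer(a_hat, term_features, weight)

protein_vec = rng.normal(size=dim)  # stand-in for a language model encoding
probs = predict(protein_vec, term_emb)  # one probability per GO term
```

With untrained weights the probabilities carry no meaning; the point is only the data flow, in which each GO term's embedding mixes in its neighbours in the hierarchy before being compared with the protein encoding.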

Submitted: Dec 6, 2021