Paper ID: 2408.04651

Knowledge AI: Fine-tuning NLP Models for Facilitating Scientific Knowledge Extraction and Understanding

Balaji Muralidharan, Hayden Beadles, Reza Marzban, Kalyan Sashank Mupparaju

This project investigates the efficacy of Large Language Models (LLMs) in understanding and extracting scientific knowledge across specific domains and to create a deep learning framework: Knowledge AI. As a part of this framework, we employ pre-trained models and fine-tune them on datasets in the scientific domain. The models are adapted for four key Natural Language Processing (NLP) tasks: summarization, text generation, question answering, and named entity recognition. Our results indicate that domain-specific fine-tuning significantly enhances model performance in each of these tasks, thereby improving their applicability for scientific contexts. This adaptation enables non-experts to efficiently query and extract information within targeted scientific fields, demonstrating the potential of fine-tuned LLMs as a tool for knowledge discovery in the sciences.

Submitted: Aug 4, 2024