BioCreative VII
BioCreative VII comprised several shared tasks aimed at advancing biomedical natural language processing (NLP), with a focus on efficient information extraction from large text sources such as PubMed and social media. Participating systems relied heavily on transformer-based models such as BERT and its variants, often combined with ensemble methods and data augmentation, for tasks including multi-label classification of articles (e.g., assigning topics to COVID-19 research papers) and named entity recognition (e.g., identifying medication names in tweets). These approaches improve the speed and accuracy of literature curation and knowledge extraction, supporting faster scientific discovery and more effective public health monitoring.
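As a rough illustration of the named entity recognition framing mentioned above, the sketch below shows how a BERT-style model can be applied as a token classifier for medication mentions in a tweet using the Hugging Face transformers library. It is a minimal sketch, not the pipeline used in the papers listed here; the model name, BIO label scheme, and example tweet are illustrative assumptions, and the classification head would need fine-tuning on annotated tweets before producing meaningful labels.

```python
# Minimal sketch of BERT-style token classification for medication NER.
# Assumptions: model name, label set, and example tweet are illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "bert-base-cased"                   # assumption: any BERT variant
LABELS = ["O", "B-MEDICATION", "I-MEDICATION"]   # assumption: simple BIO scheme

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS)
)

tweet = "Took some ibuprofen for this headache, still not working"
inputs = tokenizer(tweet, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits              # shape: (1, seq_len, num_labels)

predicted_ids = logits.argmax(dim=-1)[0]         # best label per subword token
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predicted_ids):
    # Without fine-tuning, the head is randomly initialized, so labels are noise;
    # this only demonstrates the input/output shape of the token-classification setup.
    print(f"{token}\t{LABELS[label_id]}")
```

In practice, the same framing extends to the multi-label article classification task by swapping in a sequence-classification head with sigmoid outputs over the topic labels.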
Papers
Automatic Extraction of Medication Names in Tweets as Named Entity Recognition
Carol Anderson, Bo Liu, Anas Abidin, Hoo-Chang Shin, Virginia Adams
Chemical Identification and Indexing in PubMed Articles via BERT and Text-to-Text Approaches
Virginia Adams, Hoo-Chang Shin, Carol Anderson, Bo Liu, Anas Abidin