Dataset Mention
Dataset mention research covers the identification and analysis of references to datasets in scientific literature and other text, with the aim of improving data discoverability and reproducibility. Current work focuses on automated dataset mention extraction, using techniques such as Bi-LSTM-CRF neural networks and large language models (LLMs), to enrich metadata and link data across sources. These methods support tracking dataset usage and assessing data quality (including label design and class balance), which in turn encourages better data management and reuse.
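As a rough illustration of what extraction output can look like, sequence taggers such as Bi-LSTM-CRF models commonly emit one BIO label per token, which is then collapsed into mention spans. The sketch below shows only that post-processing step. The `B-DATASET`/`I-DATASET` label names, the `bio_to_mentions` helper, and the example sentence are assumptions made for illustration and are not taken from any specific paper or library.

```python
from typing import List, Tuple


def bio_to_mentions(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Collapse per-token BIO tags into (start, end_exclusive, text) spans.

    Hypothetical label set: "B-DATASET" opens a mention, "I-DATASET"
    continues it, and any other tag (e.g. "O") closes it.
    """
    mentions = []
    start = None  # index of the first token of the mention being built
    for i, tag in enumerate(tags):
        if tag == "B-DATASET":
            # A new mention begins; flush any mention still open.
            if start is not None:
                mentions.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "I-DATASET" and start is not None:
            # Continuation of the open mention.
            continue
        else:
            # Outside tag (or a stray I- tag with no open mention).
            if start is not None:
                mentions.append((start, i, " ".join(tokens[start:i])))
            start = None
    if start is not None:
        # The sentence ended while a mention was still open.
        mentions.append((start, len(tokens), " ".join(tokens[start:])))
    return mentions


# Hand-written example; a real tagger would predict these tags.
tokens = ["We", "evaluate", "on", "ImageNet", "and", "MS", "COCO", "."]
tags = ["O", "O", "O", "B-DATASET", "O", "B-DATASET", "I-DATASET", "O"]
print(bio_to_mentions(tokens, tags))
```

Spans recovered this way could then be normalized and matched against a dataset registry to support the metadata enrichment and data linking described above.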