Human Annotation
Human annotation, the process of labeling data for machine learning, is crucial but expensive and time-consuming. Current research aims to ease this bottleneck in two main ways: active learning, which prioritizes the most informative data points for human labeling, and large language models (LLMs), which automate or assist annotation, for example by generating synthetic data or pre-annotating samples. Both approaches aim to make annotation more efficient and scalable, and so to speed up the development and deployment of AI models in domains ranging from natural language processing to medical image analysis. Better data quality and lower annotation costs matter both to AI research and to practical applications that depend on labeled data.
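To make the active learning idea concrete, the sketch below shows one common selection strategy, entropy-based uncertainty sampling: the samples whose predicted class distribution is closest to uniform are sent to annotators first. It is an illustration only and is not taken from any of the papers listed here. The function names, the `budget` parameter, and the example probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_annotation(predictions, budget):
    """Return indices of the `budget` most uncertain samples, highest entropy first."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled samples (three classes each).
predictions = [
    [0.98, 0.01, 0.01],  # confident prediction, low priority for labeling
    [0.34, 0.33, 0.33],  # near-uniform prediction, highest priority
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

chosen = select_for_annotation(predictions, budget=2)
print(chosen)  # [1, 2]
```

In a full loop, the selected samples would be labeled by humans, added to the training set, and the model retrained before the next selection round. Entropy is only one possible informativeness score; margin-based or disagreement-based scores are common alternatives.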
Papers
BioMNER: A Dataset for Biomedical Method Entity Recognition
Chen Tang, Bohao Yang, Kun Zhao, Bo Lv, Chenghao Xiao, Frank Guerin, Chenghua Lin
Mining Reasons For And Against Vaccination From Unstructured Data Using Nichesourcing and AI Data Augmentation
Damián Ariel Furman, Juan Junqueras, Z. Burçe Gümüslü, Edgar Altszyler, Joaquin Navajas, Ophelia Deroy, Justin Sulik