Human Annotation

Human annotation, the process of labeling data for machine learning, is crucial but expensive and time-consuming. Current research aims to ease this bottleneck in two main ways: active learning, which prioritizes the most informative data points for human labeling, and large language models (LLMs), which automate or assist annotation, for example by generating synthetic data or pre-annotating samples. These techniques aim to make data annotation more efficient and scalable, speeding the development and deployment of AI models in domains ranging from natural language processing to medical image analysis. Lower annotation costs and better data quality benefit both the AI research community and practical applications.

132 papers

Papers - Page 2

February 14, 2025

Accelerating Unbiased LLM Evaluation via Synthetic Feedback
Zhaoyi Zhou, Yuda Song, Andrea Zanette
Large Language Model Synthetic Feedback Human Feedback Human Annotation

February 10, 2025

InSTA: Towards Internet-Scale Training For Agents
Brandon Trabucco, Gunnar Sigurdsson, Robinson Piramuthu, Ruslan Salakhutdinov
Agent Smith Human Annotation Web Agent Large Scale Training Language Model

January 28, 2025

Experimenting with Affective Computing Models in Video Interviews with Spanish-speaking Older Adults
Josep Lopez Camunas, Cristina Bustos, Yanjun Zhu, Raquel Ros, Agata Lapedriza
Facial Expression Affective Response Older Adult Human Annotation Interview Data

January 14, 2025

Ensemble of Large Language Models for Curated Labeling and Rating of Free-text Data
Jiaxing Qiu, Dongliang Guo, Natalie Papini, Noelle Peace, Cheri A. Levinson, Teague R. Henry
Text Data Diverse Ensemble Relevance Ranking Human Generated Label Large Language Model Human Rating Human Annotation

December 24, 2024

December 20, 2024

December 19, 2024

From Human Annotation to LLMs: SILICON Annotation Workflow for Management Research
Xiang Cheng, Raveesh Mayya, João Sedoc
Large Language Model Human Annotation Annotation Pipeline

December 17, 2024

December 10, 2024

Granite Guardian
Inkit Padhi, Manish Nagireddy, Giandomenico Cornacchia, Subhajit Chaudhury, Tejaswini Pedapati, Pierre Dognin, Keerthiram Murugesan, and 15 others
Detection Model Responsible AI Human Annotation Risk Detection

December 5, 2024

MegaCOIN: Enhancing Medium-Grained Color Perception for Vision-Language Models
Ming-Chang Chiu, Shicheng Wen, Pin-Yu Chen, Xuezhe Ma
Human Labeled Visual Task Full Model Human Annotation Bitcoin User Multimodal Phenomenon Vision Language Model Color Perception

November 29, 2024

Streamlining the review process: AI-generated annotations in research manuscripts
Oscar Díaz, Xabier Garmendia, Juanan Pereira
Peer Review Annotated Chapter Information Manuscript Document Human Reviewer Annotation Tool Human Annotation DH Research Review Mechanism

November 23, 2024

"All that Glitters": Approaches to Evaluations with Unreliable Model and Human Annotations
Michael Hardy
Constructive Approach Human Labeling Human Annotation Human Label Global Evaluation Model Degradation

November 20, 2024

Prior-based Objective Inference Mining Potential Uncertainty for Facial Expression Recognition
Hanwei Liu, Huiling Cai, Qingcheng Lin, Xuefeng Li, Hui Xiao
Emotion Inference Facial Expression Recognition Expression Recognition Human Annotation

November 19, 2024

Auto-Evaluation with Few Labels through Post-hoc Regression
Benjamin Eyre, David Madras
Post Hoc Large Generative Model Automatic Evaluation Predictive Inference Human Annotation Robust Regression

November 13, 2024

Robustness and Confounders in the Demographic Alignment of LLMs with Human Perceptions of Offensiveness
Shayan Alipour, Indira Sen, Mattia Samory, Tanushree Mitra
Text Based Confounders Human Perception Human Annotation Demographic Bias Native Robustness Large Language Model Offensive Language

November 3, 2024

Diagnosing Medical Datasets with Training Dynamics
Laura Wenderoth
Challenging Dataset Medical Datasets Training Dynamic Training Data Human Annotation Hard to Classify Instance

October 22, 2024

Altogether: Image Captioning via Re-aligning Alt-text
Hu Xu, Po-Yao Huang, Xiaoqing Ellen Tan, Ching-Feng Yeh, Jacob Kahn, Christine Jou, Gargi Ghosh, Omer Levy, Luke Zettlemoyer, Wen-tau Yih, and 3 others
Alt Text Text to Image Generation Human Annotation Image Captioning

HAUR: Human Annotation Understanding and Recognition Through Text-Heavy Images

EvalMuse-40K: A Reliable and Fine-Grained Benchmark with Comprehensive Human Annotations for Text-to-Image Generation Model Evaluation

Fearful Falcons and Angry Llamas: Emotion Category Annotations of Arguments by Humans and LLMs

Monkey Transfer Learning Can Improve Human Pose Estimation

NAVCON: A Cognitively Inspired and Linguistically Grounded Corpus for Vision and Language Navigation

OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain
