Human Judgment
Human judgment, a cornerstone of cognitive science, is increasingly studied by comparing it with the outputs of ever more capable artificial intelligence models, particularly large language models (LLMs). Current research focuses on understanding and mitigating biases in human evaluations of AI-generated content, analyzing how well human and AI judgments align across diverse tasks (e.g., text generation, image captioning, question answering), and developing new metrics that better capture the nuances of human perception. These studies are crucial for improving the reliability and trustworthiness of AI systems and for fostering more effective human-AI collaboration across fields.
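To make the notion of human-AI judgment alignment concrete, the short Python sketch below shows two standard ways such agreement is commonly quantified: rank correlation for graded ratings and chance-corrected agreement for categorical verdicts. It is a generic illustration rather than the method of any paper listed here; the ratings and labels are hypothetical, and it relies only on the standard scipy.stats.spearmanr and sklearn.metrics.cohen_kappa_score functions.

```python
# Minimal sketch of measuring human-LLM judgment alignment.
# The example scores below are hypothetical, not taken from any listed paper.
from scipy.stats import spearmanr
from sklearn.metrics import cohen_kappa_score

# Hypothetical 1-5 quality ratings assigned to the same ten generated texts.
human_ratings = [5, 4, 2, 3, 5, 1, 4, 2, 3, 4]
llm_ratings   = [4, 4, 2, 3, 5, 2, 5, 2, 3, 3]

# Rank correlation: do the two judges order the outputs similarly?
rho, p_value = spearmanr(human_ratings, llm_ratings)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")

# Hypothetical binary accept/reject verdicts on the same ten items.
human_labels = ["accept", "accept", "reject", "reject", "accept",
                "reject", "accept", "reject", "accept", "accept"]
llm_labels   = ["accept", "reject", "reject", "reject", "accept",
                "reject", "accept", "accept", "accept", "accept"]

# Cohen's kappa corrects raw agreement for agreement expected by chance.
kappa = cohen_kappa_score(human_labels, llm_labels)
print(f"Cohen's kappa = {kappa:.2f}")
```

High correlation or kappa is usually read as evidence that an LLM judge can stand in for human annotators on that task, while low values flag tasks where human evaluation remains necessary.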
Papers
Integrating Expert Judgment and Algorithmic Decision Making: An Indistinguishability Framework
Rohan Alur, Loren Laine, Darrick K. Li, Dennis Shung, Manish Raghavan, Devavrat Shah
SocialGaze: Improving the Integration of Human Social Norms in Large Language Models
Anvesh Rao Vijjini, Rakesh R. Menon, Jiayi Fu, Shashank Srivastava, Snigdha Chaturvedi
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks
Anna Bavaresco, Raffaella Bernardi, Leonardo Bertolazzi, Desmond Elliott, Raquel Fernández, Albert Gatt, Esam Ghaleb, Mario Giulianelli, Michael Hanna, Alexander Koller, André F. T. Martins, Philipp Mondorf, Vera Neplenbroek, Sandro Pezzelle, Barbara Plank, David Schlangen, Alessandro Suglia, Aditya K Surikuchi, Ece Takmaz, Alberto Testoni
Evaluating Quality of Answers for Retrieval-Augmented Generation: A Strong LLM Is All You Need
Yang Wang, Alberto Garcia Hernandez, Roman Kyslyi, Nicholas Kersting