Human Alignment

Human alignment in artificial intelligence focuses on aligning the behavior and outputs of large language models (LLMs) and other AI systems with human values and preferences. Current research emphasizes methods like reinforcement learning from human feedback (RLHF), direct preference optimization (DPO), and contrastive learning, often incorporating diverse data sources such as eye-tracking and preference rankings to improve model training and evaluation. This work is crucial for ensuring the safety, reliability, and beneficial use of increasingly powerful AI systems, impacting both the development of more trustworthy AI and the broader understanding of human-computer interaction.

Papers

February 28, 2023

Read Pointer Meters in complex environments based on a Human-like Alignment and Recognition Algorithm
Yan Shu, Shaohui Liu, Honglei Xu, Feng Jiang
Image Recognition Complex Environment Spatial Transformer Network Human Alignment Reading System METER Based Correction Meter Reading

November 2, 2022

Human alignment of neural network representations
Lukas Muttenthaler, Jonas Dippel, Lorenz Linhardt, Robert A. Vandermeulen, Simon Kornblith
Computer Vision Model Neural Network Representation Mental Representation Human Alignment Conceptual Representation Similarity Judgment

October 5, 2022

Every word counts: A multilingual analysis of individual human alignment with model attention
Stephanie Brandl, Nora Hollenstein
Personality Trait Real Text Word Monolingual Data Transformer Attention Human Alignment Human Reading

Human Alignment

Papers

Read Pointer Meters in complex environments based on a Human-like Alignment and Recognition Algorithm

Human alignment of neural network representations

Every word counts: A multilingual analysis of individual human alignment with model attention