Policy Value

Policy value research focuses on aligning artificial intelligence systems, particularly large language models (LLMs), with human values and societal norms. Current work emphasizes robust evaluation frameworks and benchmarks that assess this alignment across diverse contexts, using techniques such as Bayesian inverse reinforcement learning and generative evolving testing. It also explores transformer-based models for imputing missing data in value-related datasets. This research matters for mitigating potential harms from AI systems and for ensuring responsible development and deployment, with impact in fields ranging from news recommendation to healthcare and education.

Papers

June 20, 2024

Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
Han Jiang, Xiaoyuan Yi, Zhihua Wei, Shu Wang, Xing Xie
Policy Value Value Alignment Novel Evaluation Four Bar

June 16, 2024

The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models
Bolei Ma, Xinpeng Wang, Tiancheng Hu, Anna-Carolina Haensch, Michael A. Hedderich, Barbara Plank, Frauke Kreuter
Technical Challenge Policy Value Human Opinion Psychological Trait Human AI Alignment Evaluation Funnel

June 14, 2024

Off-Policy Evaluation from Logged Human Feedback
Aniruddha Bhargava, Lalit Jain, Branislav Kveton, Ge Liu, Subhojyoti Mukherjee
Machine Learning Human Feedback Policy Evaluation Policy Value External Feedback

April 7, 2024

Ethos and Pathos in Online Group Discussions: Corpora for Polarisation Issues in Social Media
Ewelina Gajewska, Katarzyna Budzynska, Barbara Konat, Marcin Koszowy, Konrad Kiljan, Maciej Uberna, He Zhang
Large Corpus Social Medium Policy Value Political Polarization Persuasive Capability Rhetorical Structure Group Conversation

March 24, 2024

Specifying Agent Ethics (Blue Sky Ideas)
Louise A. Dennis, Michael Fisher
Policy Value Different Stakeholder Moral Dilemma Research Ethic Machine Ethic Moral Agent Innovative Idea

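March 13, 2024

Ethos: Rectifying Language Models in Orthogonal Parameter Space

On the Performance of Imputation Techniques for Missing Values on Healthcare Datasets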

March 6, 2024

Tackling Missing Values in Probabilistic Wind Power Forecasting: A Generative Approach
Honglin Wen, Pierre Pinson, Jie Gu, Zhijian Jin
Generative Approach Parameter Estimation Policy Value Wind Power Forecasting Task Probabilistic Forecasting

March 1, 2024

Authors' Values and Attitudes Towards AI-bridged Scalable Personalization of Creative Language Arts
Taewook Kim, Hyomin Han, Eytan Adar, Matthew Kay, John Joon Young Chung
Artificial Intelligence Generative AI Policy Value Author Name Scalable Personalization Creative Writing

February 26, 2024

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schütze, Dirk Hovy
Large Language Model Language Model Multiple Choice Policy Value Human Opinion Political Compass

February 13, 2024

Values That Are Explicitly Present in Fairy Tales: Comparing Samples from German, Italian and Portuguese Traditions
Alba Morollon Diaz-Faes, Carla Sofia Ribeiro Murteira, Martin Ruskov
External Sample Policy Value Brazilian Portuguese Political Event Fairy Tale Collective Memory Cultural Understanding

January 16, 2024

ValUES: A Framework for Systematic Validation of Uncertainty Estimation in Semantic Segmentation
Kim-Celine Kahl, Carsten T. Lüth, Maximilian Zenk, Klaus Maier-Hein, Paul F. Jaeger
Semantic Segmentation New Framework Uncertainty Estimation Model Uncertainty Policy Value Uncertainty Estimation Method Quantitative Validation

December 7, 2023

Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization
Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters
Model Based Reinforcement Learning Epistemic Uncertainty Policy Value Bellman Equation Inefficient Exploration Risk Averse Policy

November 13, 2023

Exploring Values in Museum Artifacts in the SPICE project: a Preliminary Study
Nele Kadastik, Thomas A. Pederson, Luis Emilio Bruni, Rossana Damiano, Antonio Lieto, Manuel Striani, Tsvi Kuflik, Alan Wecker
Top Level Ontology Semantic Parsing Commonsense Reasoning Preliminary Study Policy Value Local Culture

November 7, 2023

Discordance Minimization-based Imputation Algorithms for Missing Values in Rating Data
Young Woong Park, Jinhak Kim, Dan Zhu
Imputation Algorithm Policy Value Human Rating Imputation Accuracy Synthetic Preference Imputation Task

November 6, 2023

GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values
Farnoosh Javadi, Walid Ahmed, Habib Hajimolahoseini, Foozhan Ataiefard, Mohammad Hassanpour, Saina Asani, Austin Wen, Omar Mohamed Awad, Kangling Liu, Yang Liu
Transformer Megatron Decepticons Transformer Based Transformer Based Model Policy Value Model Reduction Secret Key OpenQA System Query Aggregation

October 27, 2023

From Values to Opinions: Predicting Human Behaviors and Stances Using Value-Injected Large Language Models
Dongjun Kang, Joonsuk Park, Yohan Jo, JinYeong Bak
Large Language Model Large Scale Human Behavior Policy Value Human Opinion Stance Analysis Argument Generation

October 21, 2023

Values, Ethics, Morals? On the Use of Moral Concepts in NLP Research
Karina Vida, Judith Simon, Anne Lauscher
Language Model NLP Field NLP Research Intercultural Ethic Policy Value Moral Theory Moral Philosophy Moral Concept

October 11, 2023

The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
Hannah Rose Kirk, Andrew M. Bean, Bertie Vidgen, Paul Röttger, Scott A. Hale
Language Model Human Feedback Human Preference Policy Value Future Scenario

September 27, 2023

Examining the Values Reflected by Children during AI Problem Formulation
Utkarsh Dwivedi, Salma Elsayed-ali, Elizabeth Bonsignore, Hernisa Kacorri
AI System Nine Year Old Child Policy Value Machine Teaching
