Moral Preference Elicitation

Moral preference elicitation aims to capture and represent human values quantitatively, for applications such as aligning AI systems with human ethics. Current research focuses on making elicitation methods more reliable and efficient, and on addressing challenges such as the instability of moral judgments over time and the limitations of active learning algorithms in this setting. This work matters for ethical AI because trustworthy, beneficial systems depend on accurately reflecting diverse human values. Ongoing work explores several approaches, including text-based analysis of moral judgments and new methods for synthesizing diverse value inputs into a coherent representation.

Papers

August 5, 2024

On The Stability of Moral Preferences: A Problem with Computational Elicitation Methods
Kyle Boerstler, Vijay Keswani, Lok Chan, Jana Schaich Borg, Vincent Conitzer, Hoda Heidari, Walter Sinnott-Armstrong
Core Stability Right Problem Moral Judgment Preference Elicitation Moral Preference Elicitation

July 26, 2024

On the Pros and Cons of Active Learning for Moral Preference Elicitation
Vijay Keswani, Vincent Conitzer, Hoda Heidari, Jana Schaich Borg, Walter Sinnott-Armstrong
Active Learning Preference Elicitation Moral Preference Elicitation

May 28, 2024

Decoding moral judgement from text: a pilot study
Diana E. Gherman, Thorsten O. Zander
Text Modality Neural Recording Pilot Study Moral Judgment Text Based Cue Moral Preference Elicitation

March 27, 2024

What are human values, and how do we align AI to them?
Oliver Klingefjord, Ryan Lowe, Joe Edelman
Language Model Artificial Intelligence Human Value Moral Preference Elicitation
