Expectation Alignment
Expectation alignment in artificial intelligence focuses on matching the behavior of AI agents to the actual expectations of their human users, addressing reward misspecification and the unintended consequences it produces. Current research explores methods for inferring user expectations, often drawing on theory-of-mind frameworks, and evaluates agents with multiple metrics beyond simple reward maximization, using techniques such as linear programming and stochastic dominance. This work is crucial for improving AI safety and reliability, leading to more trustworthy and beneficial AI systems across applications ranging from robotics and conversational agents to decision-making systems.
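As a concrete illustration of evaluating agents beyond mean reward, the sketch below checks first-order stochastic dominance between the empirical return distributions of two policies: policy A dominates policy B when A's CDF lies at or below B's at every return level, i.e., A is at least as likely to exceed any threshold. This is a minimal, generic sketch; the function name and the sample returns are hypothetical and not taken from any particular paper.

```python
import numpy as np

def first_order_dominates(returns_a: np.ndarray, returns_b: np.ndarray) -> bool:
    """Check whether empirical return sample A first-order stochastically
    dominates sample B: CDF_A(x) <= CDF_B(x) at every threshold x."""
    thresholds = np.union1d(returns_a, returns_b)
    # Empirical CDF: fraction of samples <= each threshold.
    cdf_a = np.searchsorted(np.sort(returns_a), thresholds, side="right") / len(returns_a)
    cdf_b = np.searchsorted(np.sort(returns_b), thresholds, side="right") / len(returns_b)
    return bool(np.all(cdf_a <= cdf_b))

# Hypothetical per-episode returns for two candidate policies.
policy_a = np.array([0.9, 1.1, 1.3, 1.5])
policy_b = np.array([0.5, 0.8, 1.2, 1.4])
print(first_order_dominates(policy_a, policy_b))  # True: A dominates B
```

Unlike a comparison of average returns, a dominance check of this kind is sensitive to the whole return distribution, so a policy with a slightly lower mean but a much worse left tail cannot be declared preferable.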