Human Feedback
Human feedback is crucial for aligning artificial intelligence models, particularly large language models, with human preferences and values. Current research focuses on making the incorporation of human feedback into reinforcement learning frameworks more efficient and reliable, exploring techniques such as macro actions, active learning, and reward model optimization to address challenges like the cost and subjectivity of human judgments. This work matters because it directly affects the safety, trustworthiness, and overall effectiveness of AI systems across diverse applications, from autonomous driving to educational assessment, making robust and efficient integration of human feedback a key area of ongoing investigation.
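To make the reward-model side of this concrete, the sketch below shows the pairwise Bradley-Terry objective commonly used in RLHF pipelines to turn human preference labels into a trainable reward signal. It is a minimal illustration only: the model architecture, data, and hyperparameters are placeholders, not code from any of the papers listed below.

```python
# Minimal sketch of a pairwise (Bradley-Terry) reward-model loss, as commonly
# used in RLHF pipelines. All names, shapes, and data here are illustrative.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Toy reward model: embeds token ids and maps the mean embedding to a scalar."""
    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, 1)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len) -> one scalar reward per sequence
        return self.head(self.embed(tokens).mean(dim=1)).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: maximize log sigmoid(r_chosen - r_rejected),
    # i.e. push the model to score the human-preferred response higher.
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Hypothetical batch of tokenized (chosen, rejected) response pairs.
model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
chosen = torch.randint(0, 1000, (8, 32))    # human-preferred responses
rejected = torch.randint(0, 1000, (8, 32))  # dispreferred responses

loss = preference_loss(model(chosen), model(rejected))
loss.backward()
optimizer.step()
print(f"pairwise preference loss: {loss.item():.4f}")
```

The trained reward model can then score candidate responses during policy optimization; many of the papers below study how to make this feedback loop cheaper or more reliable, or how to bypass explicit reward inference altogether.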
Papers
Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
Qining Zhang, Honghao Wei, Lei Ying
Joint Learning of Context and Feedback Embeddings in Spoken Dialogue
Livia Qian, Gabriel Skantze
Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback
Chenliang Li, Siliang Zeng, Zeyi Liao, Jiaxiang Li, Dongyeop Kang, Alfredo Garcia, Mingyi Hong
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Omar Shaikh, Michelle Lam, Joey Hejna, Yijia Shao, Michael Bernstein, Diyi Yang
Inverse Constitutional AI: Compressing Preferences into Principles
Arduin Findeis, Timo Kaufmann, Eyke Hüllermeier, Samuel Albanie, Robert Mullins
Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback
Chen Chen, Yuchen Hu, Wen Wu, Helin Wang, Eng Siong Chng, Chao Zhang