Human Feedback
Human feedback is crucial for aligning artificial intelligence models, particularly large language models, with human preferences and values. Current research focuses on improving the efficiency and reliability of incorporating human feedback into reinforcement learning frameworks, exploring techniques like macro actions, active learning, and reward model optimization to address challenges such as the cost and subjectivity of human judgments. This work is significant because it directly impacts the safety, trustworthiness, and overall effectiveness of AI systems across diverse applications, from autonomous driving to educational assessment. The development of more robust and efficient methods for integrating human feedback is a key area of ongoing investigation.
Papers
STACKFEED: Structured Textual Actor-Critic Knowledge Base Editing with FeedBack
Naman Gupta, Shashank Kirtania, Priyanshu Gupta, Krishna Kariya, Sumit Gulwani, Arun Iyer, Suresh Parthasarathy, Arjun Radhakrishna, Sriram K. Rajamani, Gustavo Soares
Feedback Favors the Generalization of Neural ODEs
Jindou Jia, Zihan Yang, Meng Wang, Kexin Guo, Jianfei Yang, Xiang Yu, Lei Guo
Automated Filtering of Human Feedback Data for Aligning Text-to-Image Diffusion Models
Yongjin Yang, Sihyeon Kim, Hojung Jung, Sangmin Bae, SangMook Kim, Se-Young Yun, Kimin Lee
Preference Fine-Tuning for Factuality in Chest X-Ray Interpretation Models Without Human Feedback
Dennis Hein, Zhihong Chen, Sophie Ostmeier, Justin Xu, Maya Varma, Eduardo Pontes Reis, Arne Edward Michalson, Christian Bluethgen, Hyun Joo Shin, Curtis Langlotz, Akshay S Chaudhari
MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning
Xiaoyang Liu, Yunyao Mao, Wengang Zhou, Houqiang Li
How Reliable Is Human Feedback For Aligning Large Language Models?
Min-Hsuan Yeh, Leitian Tao, Jeffrey Wang, Xuefeng Du, Yixuan Li
Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models
Angela Lopez-Cardona, Carlos Segura, Alexandros Karatzoglou, Sergi Abadal, Ioannis Arapakis