External Feedback

External feedback is central to guiding machine learning models toward desired behaviors and is an active area of research. Current work explores a range of feedback modalities, including comparative (pairwise) preferences, step-level explanations, and crowdsourced assessments from large language models (LLMs), using techniques such as Proximal Policy Optimization (PPO) alongside new algorithms designed for relative feedback. This research aims to improve model performance, efficiency, and alignment with human values, with applications spanning robotics, reinforcement learning, natural language processing, and human-computer interaction. The ultimate goal is more robust, adaptable, and human-centered AI systems.
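
As a concrete illustration of the comparative-preference modality, the sketch below trains a scalar reward model on pairwise preferences with the standard Bradley-Terry loss, as commonly used upstream of PPO in reinforcement learning from human feedback. This is a minimal sketch, not any specific paper's method; the `RewardModel` class, feature dimensions, and random inputs are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Scalar reward head over pooled features (an illustrative
    stand-in for a real encoder/backbone)."""
    def __init__(self, feature_dim: int):
        super().__init__()
        self.head = nn.Linear(feature_dim, 1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # One scalar reward per example.
        return self.head(features).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry negative log-likelihood: push the score of the
    preferred response above the rejected one."""
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage: random features stand in for encoded response pairs.
model = RewardModel(feature_dim=16)
chosen = torch.randn(8, 16)    # features of preferred responses
rejected = torch.randn(8, 16)  # features of rejected responses
loss = preference_loss(model(chosen), model(rejected))
loss.backward()
```

The logistic form follows from modeling the preference probability as sigmoid(r_chosen - r_rejected); in a typical pipeline, PPO would then optimize a policy against the learned reward.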

Papers