Pairwise Preference

Pairwise preference learning trains models to predict which of two items is preferred, using comparisons collected from human feedback or generated automatically. Current research emphasizes making these methods more efficient and robust, particularly for large language models, by incorporating richer feedback than simple binary preferences, handling intransitive preferences (for example, A preferred to B, B to C, and C to A), and mitigating biases in both human and AI-generated preferences. This work matters for AI alignment, for the quality of AI-generated content, and for human-computer interaction in applications such as machine translation, search engines, and interactive reinforcement learning.
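
As a rough illustration of predicting a preference between two items, the sketch below fits one scalar score per item from binary comparisons. It assumes a Bradley-Terry-style model, where the probability that one item is preferred is the sigmoid of the score difference. That model is an assumption chosen for illustration and is not named in the summary above. The function names, toy data, learning rate, and epoch count are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_pairwise_scores(num_items, comparisons, lr=0.1, epochs=500):
    """Fit one score per item by gradient ascent on the log-likelihood.

    comparisons is a list of (winner, loser) index pairs.
    Assumed model: P(winner preferred) = sigmoid(score[winner] - score[loser]).
    """
    scores = [0.0] * num_items
    for _ in range(epochs):
        grads = [0.0] * num_items
        for winner, loser in comparisons:
            p = sigmoid(scores[winner] - scores[loser])
            # Gradient of log(p) with respect to each of the two scores.
            grads[winner] += 1.0 - p
            grads[loser] -= 1.0 - p
        scores = [s + lr * g for s, g in zip(scores, grads)]
    return scores


def preference_probability(scores, a, b):
    """Predicted probability that item a is preferred over item b."""
    return sigmoid(scores[a] - scores[b])


# Hypothetical toy data: noisy comparisons among three items.
comparisons = [
    (0, 1), (0, 1), (1, 0),
    (1, 2), (1, 2), (2, 1),
    (0, 2), (0, 2), (0, 2), (2, 0),
]
scores = fit_pairwise_scores(3, comparisons)
print(scores)
print(preference_probability(scores, 0, 2))
```

A single score per item always yields a transitive ranking, so a model of this form cannot represent the intransitive preferences mentioned above. It also uses only binary outcomes, not richer feedback.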

Papers