Paper ID: 2209.05185
Open-Domain Dialog Evaluation using Follow-Ups Likelihood
Maxime De Bruyn, Ehsan Lotfi, Jeska Buhmann, Walter Daelemans
Automatic evaluation of open-domain dialogs remains an unsolved problem: existing methods correlate only weakly with human annotations. This paper presents a new automated evaluation method using follow-ups: we measure the probability that a language model will continue the conversation with a fixed set of follow-ups (e.g., "not really relevant here", "what are you trying to say"). Compared against twelve existing methods, our new evaluation achieves the highest correlation with human evaluations.
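The core idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the stub language-model interface, and the particular follow-up strings are assumptions; the paper's method scores how likely a language model is to continue the dialog with fixed (negative) follow-ups, so a lower follow-up likelihood suggests a more coherent response.

```python
# Sketch of follow-up likelihood scoring (illustrative, not the paper's code).
# Given a dialog context and a candidate response, we score how probable a
# language model finds a fixed set of negative follow-ups as the next turn.

NEGATIVE_FOLLOWUPS = [
    "not really relevant here",
    "what are you trying to say",
]

def followup_likelihood(lm, context, response):
    """Average per-token log-likelihood of the negative follow-ups
    continuing the dialog. `lm` is any callable mapping a string to a
    list of per-token log-probabilities (a real implementation would
    use a pretrained causal language model here)."""
    scores = []
    for followup in NEGATIVE_FOLLOWUPS:
        dialog = f"{context}\n{response}\n{followup}"
        log_probs = lm(dialog)          # per-token log-probs of the dialog
        scores.append(sum(log_probs) / len(log_probs))
    return sum(scores) / len(scores)    # lower = better response
```

In practice, `lm` would be a pretrained causal language model (e.g., via the Hugging Face `transformers` library), and only the follow-up tokens would typically be scored, conditioned on the context and response.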
Submitted: Sep 12, 2022