Paper ID: 2412.20632

EVOLVE: Emotion and Visual Output Learning via LLM Evaluation

Jordan Sinclair, Christopher Reardon

Human acceptance of social robots is greatly effected by empathy and perceived understanding. This necessitates accurate and flexible responses to various input data from the user. While systems such as this can become increasingly complex as more states or response types are included, new research in the application of large language models towards human-robot interaction has allowed for more streamlined perception and reaction pipelines. LLM-selected actions and emotional expressions can help reinforce the realism of displayed empathy and allow for improved communication between the robot and user. Beyond portraying empathy in spoken or written responses, this shows the possibilities of using LLMs in actuated, real world scenarios. In this work we extend research in LLM-driven nonverbal behavior for social robots by considering more open-ended emotional response selection leveraging new advances in vision-language models, along with emotionally aligned motion and color pattern selections that strengthen conveyance of meaning and empathy.

Submitted: Dec 30, 2024