Counterspeech Generation
Counterspeech generation uses artificial intelligence to automatically create responses that counter online hate speech, aiming to mitigate its harmful effects without censorship. Current research focuses on improving the quality and effectiveness of generated counterspeech by controlling factors such as politeness and emotional tone and by refuting specific hateful elements of a message. Most approaches build on large language models (LLMs) such as GPT and DialoGPT and employ techniques such as reinforcement learning and multi-task instruction tuning. The field matters because automated counterspeech generation could reduce the burden on human moderators and potentially lessen the spread of harmful content.
Papers
CODEOFCONDUCT at Multilingual Counterspeech Generation: A Context-Aware Model for Robust Counterspeech Generation in Low-Resource Languages
Michael Bennie, Bushi Xiao, Chryseis Xinyi Liu, Demi Zhang, Jian Meng, Alayo Tripp
PANDA -- Paired Anti-hate Narratives Dataset from Asia: Using an LLM-as-a-Judge to Create the First Chinese Counterspeech Dataset
Michael Bennie, Demi Zhang, Bushi Xiao, Jing Cao, Chryseis Xinyi Liu, Jian Meng, Alayo Tripp