Toxic Language
Toxic language, encompassing hate speech, insults, and other harmful expressions, is a significant concern in online communication, and research focuses on both detecting and mitigating it. Current efforts use pretrained language models such as BERT and large language models (LLMs) such as GPT, together with techniques like counterfactual generation and attention-weight adjustment, to identify and reduce toxicity across languages and contexts, including social media, gaming platforms, and machine translation. This research is crucial for creating safer online environments and for the ethical development and deployment of AI systems, particularly LLMs, which can inadvertently perpetuate or amplify harmful biases.
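The detection side of this task can be illustrated with a minimal lexicon-based baseline. This is only a toy sketch, not any system from the surveyed work (which uses learned classifiers such as fine-tuned BERT); the `TOXIC_TERMS` list and threshold below are hypothetical placeholders chosen for illustration.

```python
import re

# Hypothetical toy lexicon; real systems learn toxicity signals
# from labeled data rather than matching a fixed word list.
TOXIC_TERMS = {"idiot", "stupid", "hate"}

def toxicity_score(text: str) -> float:
    """Return the fraction of tokens found in the toxic lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in TOXIC_TERMS)
    return hits / len(tokens)

def is_toxic(text: str, threshold: float = 0.1) -> bool:
    """Flag text whose lexicon-hit rate meets an illustrative threshold."""
    return toxicity_score(text) >= threshold

print(is_toxic("You are such an idiot"))  # one hit in five tokens -> True
print(is_toxic("Have a wonderful day"))   # no hits -> False
```

A baseline like this shows why learned models matter: it misses implicit toxicity and misfires on benign uses of flagged words, which is precisely what contextual classifiers and techniques like counterfactual generation aim to address.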