Toxic Speech Detection
Toxic speech detection aims to automatically identify harmful or offensive language online, with a focus on improving accuracy and mitigating the biases of existing models. Current research emphasizes more transparent and adaptable models, often using techniques such as knowledge distillation and attention mechanisms, and addresses two harder problems: detecting implicit toxicity and withstanding adversarial attacks (e.g., intentionally misspelled words). The field matters for safer online environments and more equitable content moderation, and it drives ongoing efforts to build more robust, unbiased, and explainable detection systems.
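To make the adversarial-misspelling problem concrete, the sketch below shows how a naive blocklist lookup can be evaded by character substitution or letter repetition, and how a simple normalization pass can recover some of those cases. It is only an illustration under stated assumptions: the blocklist, the substitution table, and the function names are hypothetical placeholders, and none of this is drawn from a specific paper or production system.

```python
import re

# Hypothetical placeholder blocklist; a real system would use a trained
# classifier rather than a fixed word list.
BLOCKLIST = {"idiot", "stupid"}

# Assumed (illustrative, incomplete) map of common character substitutions.
SUBSTITUTIONS = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text: str) -> str:
    """Lowercase, undo simple substitutions, collapse runs of 3+ repeated characters."""
    text = text.lower().translate(SUBSTITUTIONS)
    return re.sub(r"(.)\1{2,}", r"\1", text)


def naive_flag(text: str) -> bool:
    """Flag text only if a blocklisted word appears verbatim (case-insensitive)."""
    return any(tok in BLOCKLIST for tok in re.findall(r"[a-z]+", text.lower()))


def normalized_flag(text: str) -> bool:
    """Flag text after normalizing obfuscated spellings."""
    return any(tok in BLOCKLIST for tok in re.findall(r"[a-z]+", normalize(text)))


if __name__ == "__main__":
    for sample in ["you are an 1d10t", "that is stuuupid", "have a nice day"]:
        print(sample, naive_flag(sample), normalized_flag(sample))
```

Normalization of this kind only handles surface-level obfuscation. It does nothing for implicit toxicity, where no offensive token appears at all, which is one reason the research summarized above moves beyond keyword matching.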