Toxicity Classification
Toxicity classification aims to automatically identify harmful or offensive language in text and speech, with a focus on improving accuracy and fairness across languages and demographic groups. Current research emphasizes robust models, often built on large language models and cross-modal learning (combining text and speech data). It also addresses biases and limitations in existing datasets through new data creation methods and improved evaluation benchmarks. The field is important for mitigating online harms and fostering safer digital environments, with applications in content moderation, social media platforms, and the development of responsible AI systems.