Toxic Language Detection
Toxic language detection aims to automatically and fairly identify harmful or offensive language in text and other media. Current research focuses on improving model efficiency (e.g., compact transformer architectures), mitigating bias through techniques such as counterfactual causal debiasing and conditional multi-task learning, and hardening models against adversarial attacks. The field is crucial for creating safer online environments and fostering more equitable interactions, informing both the development of fairer AI systems and the design of effective content moderation strategies.
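To make the task concrete, here is a minimal sketch of a toxicity classifier. It is a toy bag-of-words logistic-regression baseline trained on a handful of made-up example sentences, not one of the compact transformer or debiasing approaches surveyed above; the training data, vocabulary, and `toxicity` helper are all illustrative assumptions.

```python
# Toy bag-of-words logistic-regression baseline for toxicity
# classification. The training examples and vocabulary are illustrative
# only; production systems use learned subword tokenizers and
# (compact) transformer encoders instead of word counts.
import math
from collections import Counter

TRAIN = [
    ("you are awful and stupid", 1),
    ("i hate you so much", 1),
    ("have a wonderful day", 0),
    ("thanks for the helpful answer", 0),
    ("you idiot get lost", 1),
    ("great work on the project", 0),
]

vocab = sorted({w for text, _ in TRAIN for w in text.split()})

def featurize(text):
    """Map a sentence to a vector of per-word counts over `vocab`."""
    counts = Counter(text.split())
    return [counts.get(w, 0) for w in vocab]

# Train with plain stochastic gradient descent on the logistic loss.
weights = [0.0] * len(vocab)
bias = 0.0
lr = 0.5
for _ in range(200):
    for text, label in TRAIN:
        x = featurize(text)
        z = bias + sum(wi * xi for wi, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))
        err = p - label  # gradient of the logistic loss w.r.t. z
        weights = [wi - lr * err * xi for wi, xi in zip(weights, x)]
        bias -= lr * err

def toxicity(text):
    """Return the model's estimated probability that `text` is toxic."""
    z = bias + sum(wi * xi for wi, xi in zip(weights, featurize(text)))
    return 1.0 / (1.0 + math.exp(-z))
```

A baseline like this also illustrates why the bias-mitigation work above matters: the model scores whatever surface words correlate with the toxic label in its training data, so spurious lexical correlations translate directly into biased predictions.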