Content Moderation
Content moderation aims to identify and remove harmful or inappropriate content from online platforms, balancing freedom of expression against the need for safe online environments. Current research focuses on large language models (LLMs) and transformer-based architectures, often incorporating multimodal data (text, images, video, audio) and contextual information to improve the accuracy and fairness of detecting and mitigating harmful content such as hate speech, misinformation, and material inappropriate for children. The field is crucial for maintaining healthy online communities and is driving advances in AI, particularly in bias detection, explainable AI, and efficient model deployment for resource-constrained environments.
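As a rough illustration of the transformer-based detection approach described above, the sketch below flags toxic comments with an off-the-shelf text classifier. It assumes the Hugging Face `transformers` library (with a PyTorch backend) is installed; the `unitary/toxic-bert` checkpoint and the 0.8 flagging threshold are illustrative choices, not taken from any of the papers listed here.

```python
# Minimal sketch of transformer-based harmful-content detection.
# Assumptions: `transformers` + torch installed; the `unitary/toxic-bert`
# checkpoint is illustrative, and any moderation-tuned model could be swapped in.
from transformers import pipeline

# Pretrained text-classification pipeline fine-tuned for toxicity detection.
classifier = pipeline("text-classification", model="unitary/toxic-bert")

def moderate(comment: str, threshold: float = 0.8) -> bool:
    """Return True if the comment should be flagged for human review."""
    top = classifier(comment)[0]  # e.g. {"label": "toxic", "score": 0.97}
    # Label names depend on the checkpoint; here we flag on the top score alone.
    return top["score"] >= threshold

if __name__ == "__main__":
    for text in ["Have a great day!", "You are worthless and everyone hates you."]:
        print(f"{text!r} -> {'flag' if moderate(text) else 'allow'}")
```

In practice a fixed threshold is usually calibrated against human review decisions, which is the kind of online calibration problem addressed by the bandits paper listed below.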
Papers
Bandits for Online Calibration: An Application to Content Moderation on Social Media Platforms
Vashist Avadhanula, Omar Abdul Baki, Hamsa Bastani, Osbert Bastani, Caner Gocmen, Daniel Haimovich, Darren Hwang, Dima Karamshuk, Thomas Leeper, Jiayuan Ma, Gregory Macnamara, Jake Mullett, Christopher Palow, Sung Park, Varun S Rajagopal, Kevin Schaeffer, Parikshit Shah, Deeksha Sinha, Nicolas Stier-Moses, Peng Xu
CoRAL: a Context-aware Croatian Abusive Language Dataset
Ravi Shekhar, Mladen Karan, Matthew Purver