Content Moderation

Content moderation aims to identify and remove harmful or inappropriate content from online platforms, balancing freedom of expression against the need for safe online environments. Current research centers on large language models (LLMs) and transformer-based architectures, often incorporating multimodal signals (text, images, video, audio) and contextual information to improve the accuracy and fairness of detecting and mitigating harmful content such as hate speech, misinformation, and material that is inappropriate for children. The field is crucial for maintaining healthy online communities and is driving advances in AI, particularly in bias detection, explainable AI, and efficient model deployment for resource-constrained environments.
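
As a concrete illustration of the transformer-based detection pipelines this research builds on, below is a minimal sketch of text-only toxicity scoring. It assumes the Hugging Face `transformers` library and the publicly available `unitary/toxic-bert` checkpoint; the threshold, label handling, and flagging logic are illustrative assumptions, not the method of any particular paper listed below.

```python
# Minimal sketch: scoring comments with a pretrained toxicity classifier.
# Assumes the Hugging Face `transformers` library is installed and that
# the `unitary/toxic-bert` checkpoint is available; any text-classification
# model fine-tuned on moderation labels could be substituted.
from transformers import pipeline

classifier = pipeline("text-classification", model="unitary/toxic-bert")

comments = [
    "Thanks for the helpful explanation!",
    "You are an idiot and nobody wants you here.",
]

# Flag comments whose top predicted score exceeds a threshold.
# The label set (e.g. "toxic", "insult") depends on the chosen checkpoint,
# and 0.5 is an arbitrary illustrative cutoff.
THRESHOLD = 0.5
for comment, result in zip(comments, classifier(comments)):
    flagged = result["score"] > THRESHOLD
    print(f"{'FLAG' if flagged else 'ok'}\t{result['label']}\t"
          f"{result['score']:.2f}\t{comment}")
```

The threshold choice reflects the trade-off noted above: a lower cutoff removes more harmful content but risks suppressing legitimate expression, which is why deployed systems often route borderline scores to human review rather than acting on them automatically.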

Papers