Online Content Moderation

Online content moderation aims to identify and remove harmful content from online platforms, balancing freedom of expression with the need for a safe online environment. Current research focuses on improving automated moderation systems using large language models (LLMs) and multimodal models, often incorporating community rules and adapting models for resource-constrained devices through techniques like low-rank adaptation. This field is crucial for mitigating the spread of harmful content like hate speech, and ongoing work emphasizes improving accuracy, efficiency, and privacy while addressing biases and the unique challenges of different online platforms and content types.
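To make the low-rank adaptation idea mentioned above concrete, below is a minimal sketch (not drawn from any specific paper listed here) of adapting a small pretrained transformer into a binary "harmful / not harmful" moderation classifier with LoRA, using the Hugging Face `transformers` and `peft` libraries. The base model name, rank, label set, and target modules are illustrative assumptions, not a reference implementation.

```python
# Hedged sketch: LoRA-adapted text classifier for content moderation.
# Assumptions: DistilBERT as the base model, two labels, rank-8 adapters.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

model_name = "distilbert-base-uncased"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# LoRA injects small trainable low-rank matrices into the attention projections,
# so only a tiny fraction of parameters is updated -- the property that makes
# on-device or resource-constrained fine-tuning feasible.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                 # rank of the low-rank update (assumed)
    lora_alpha=16,                       # scaling factor
    lora_dropout=0.1,
    target_modules=["q_lin", "v_lin"],   # DistilBERT attention projection layers
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # reports how few weights are trainable

# Scoring a post with the (still untrained) adapted classifier; in practice a
# fine-tuning step on labeled moderation data would precede this.
inputs = tokenizer("example post text", return_tensors="pt", truncation=True)
logits = model(**inputs).logits
```

The same adapter-based setup extends naturally to larger LLM backbones or multimodal encoders; only the base model, target modules, and label set change.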

Papers