Taboo Classification
Taboo classification focuses on automatically identifying offensive or hateful language, a task complicated by the subjective and context-dependent nature of what counts as "taboo." Current research emphasizes mitigating biases in these systems, particularly those stemming from skewed training data that disproportionately affect minority groups; community-specific classifiers are one common mitigation. This work also explores zero-shot learning approaches that leverage large language models to classify taboo content without explicit training on specific taboo terms, and investigates the mechanisms underlying these models' performance. Improving the accuracy and fairness of taboo classification is crucial for building ethical and inclusive AI systems across a wide range of applications.
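The zero-shot approach mentioned above can be sketched as a simple prompt-and-parse loop: the classifier never sees labeled taboo examples at training time, only a natural-language description of the task at inference time. In this minimal sketch, `query_llm` is a hypothetical stand-in for any chat-completion API, and the prompt wording is illustrative, not taken from any specific paper:

```python
def build_prompt(text: str) -> str:
    """Frame the task in natural language so no taboo-specific
    training examples are required (the zero-shot setting)."""
    return (
        "Does the following text contain offensive or hateful language? "
        "Answer only 'yes' or 'no'.\n\n"
        f"Text: {text}"
    )


def classify_zero_shot(text: str, query_llm) -> bool:
    """Return True if the model judges the text taboo.

    `query_llm` is any callable that takes a prompt string and
    returns the model's reply string (hypothetical interface).
    """
    reply = query_llm(build_prompt(text))
    return reply.strip().lower().startswith("yes")
```

In practice the callable would wrap a real LLM client, and the prompt would typically be calibrated for the target community and context, since a fixed wording can reproduce the very annotation biases this line of research aims to reduce.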