Hate Speech
Hate speech, meaning discriminatory and derogatory language targeting individuals or groups, is a significant online problem. Current research focuses on improving automated hate speech detection using deep learning models such as LSTMs and transformer-based architectures like BERT, often incorporating multimodal data (text and images) and addressing challenges such as implicit hate, code-mixing, and cross-cultural variation. These efforts aim to make detection systems more accurate and fairer, contributing to safer online environments and informing content moderation strategies. The field also explores methods for generating counterspeech and mitigating biases within detection models.
Papers
IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language
Lucky Susanto, Musa Izzanardi Wijanarko, Prasetia Anugrah Pratama, Traci Hong, Ika Idris, Alham Fikri Aji, Derry Wijaya
Empirical Evaluation of Public HateSpeech Datasets
Sardar Jaf, Basel Barakat
Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster
Agostina Calabrese, Leonardo Neves, Neil Shah, Maarten W. Bos, Björn Ross, Mirella Lapata, Francesco Barbieri
Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech
Neemesh Yadav, Sarah Masud, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty