Paper ID: 2409.12658

Exploring the topics, sentiments and hate speech in the Spanish information environment

ALEJANDRO BUITRAGO LOPEZ, Javier Pastor-Galindo, José Antonio Ruipérez-Valiente

In the digital era, the internet and social media have transformed communication but have also facilitated the spread of hate speech and disinformation, leading to radicalization, polarization, and toxicity. This is especially concerning for media outlets due to their significant role in shaping public discourse. This study examines the topics, sentiments, and hate prevalence in 337,807 response messages (website comments and tweets) to news from five Spanish media outlets (La Vanguardia, ABC, El País, El Mundo, and 20 Minutos) in January 2021. These public reactions were originally labeled as distinct types of hate by experts following an original procedure, and they are now classified into three sentiment values (negative, neutral, or positive) and main topics. The BERTopic unsupervised framework was used to extract 81 topics, manually named with the help of Large Language Models (LLMs) and grouped into nine primary categories. Results show social issues (22.22%), expressions and slang (20.35%), and political issues (11.80%) as the most discussed. Content is mainly negative (62.7%) and neutral (28.57%), with low positivity (8.73%). Toxic narratives relate to conversation expressions, gender, feminism, and COVID-19. Despite low levels of hate speech (3.98%), the study confirms high toxicity in online responses to social and political topics.

Submitted: Sep 19, 2024