Twitter Corpus
Twitter corpora are collections of tweets used to train and evaluate natural language processing (NLP) models, primarily for understanding and analyzing informal online communication. Current research emphasizes addressing biases that stem from the overrepresentation of standard English, and developing robust models for tasks such as personality profiling, cyberbullying detection, and identifying diverse English varieties, often with transformer-based architectures such as BERT. These efforts aim to improve the fairness and accuracy of NLP systems, supporting more effective tools for social media analysis and a deeper understanding of online behavior.