Paper ID: 2207.11236
Twitmo: A Twitter Data Topic Modeling and Visualization Package for R
Andreas Buchmüller, Gillian Kant, Christoph Weisser, Benjamin Säfken, Krisztina Kis-Katos, Thomas Kneib
We present Twitmo, a package that provides a broad range of methods to collect, pre-process, analyze and visualize geo-tagged Twitter data. Twitmo enables the user to collect geo-tagged Tweets from Twitter and and provides a comprehensive and user-friendly toolbox to generate topic distributions from Latent Dirichlet Allocations (LDA), correlated topic models (CTM) and structural topic models (STM). Functions are included for pre-processing of text, model building and prediction. In addition, one of the innovations of the package is the automatic pooling of Tweets into longer pseudo-documents using hashtags and cosine similarities for better topic coherence. The package additionally comes with functionality to visualize collected data sets and fitted models in static as well as interactive ways and offers built-in support for model visualizations via LDAvis providing great convenience for researchers in this area. The Twitmo package is an innovative toolbox that can be used to analyze public discourse of various topics, political parties or persons of interest in space and time.
Submitted: Jul 8, 2022