Topic Modeling
Topic modeling is a machine learning technique for discovering the underlying themes (topics) in large collections of text, with the aim of producing a structured, interpretable summary of their content. Current research focuses on improving topic coherence and interpretability, particularly for short texts and multilingual data. It often combines transformer-based models such as BERT, variational autoencoders, and graph neural networks with traditional methods such as latent Dirichlet allocation (LDA) and non-negative matrix factorization (NMF). These advances extend the usefulness of topic modeling across fields ranging from social media analysis and fake news detection to scientific literature review and legal document organization, by yielding more accurate thematic representations of complex text. Large language models are also further improving topic labeling and evaluation.
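To make the idea of "discovering topics" concrete, here is a minimal sketch of NMF, one of the traditional methods named above, written in pure Python with the standard multiplicative update rules. It is an illustration under stated assumptions, not the method of any paper listed here: the four toy documents, the function names (build_matrix, nmf, top_words), and the choice of two topics are all hypothetical, and a real analysis would normally use a tested library implementation.

```python
import random

def build_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts word j in document i."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    matrix = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d.lower().split():
            matrix[i][index[w]] += 1.0
    return vocab, matrix

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x words) into W (docs x topics) and H (topics x words)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h

def top_words(h, vocab, n=3):
    """For each topic row of H, list the n highest-weighted words."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]

# Hypothetical toy corpus: two documents about health, two about law.
docs = [
    "vaccine virus covid vaccine",
    "virus covid outbreak",
    "court judge law court",
    "law judge verdict",
]
vocab, v = build_matrix(docs)
w, h = nmf(v, k=2)
for t, words in enumerate(top_words(h, vocab)):
    print("topic", t, words)
```

Each row of H weights the vocabulary for one topic, and each row of W says how strongly a document expresses each topic. The transformer-based approaches mentioned above replace the raw word counts with learned embeddings, but the goal of a small set of interpretable word groups is the same.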
Papers
Unveiling the Potential of BERTopic for Multilingual Fake News Analysis -- Use Case: Covid-19
Karla Schäfer, Jeong-Eun Choi, Inna Vogel, Martin Steinebach
Unveiling Disparities in Maternity Care: A Topic Modelling Approach to Analysing Maternity Incident Investigation Reports
Georgina Cosma, Mohit Kumar Singh, Patrick Waterson, Gyuchan Thomas Jun, Jonathan Back