Topic Modeling

Topic modeling is a machine learning technique for discovering the underlying themes (topics) in large collections of text, with the aim of producing a structured, interpretable summary of their content. Current research focuses on improving topic coherence and interpretability, particularly for short texts and multilingual data. It often combines transformer-based architectures such as BERT, variational autoencoders, and graph neural networks with traditional methods such as latent Dirichlet allocation (LDA) and non-negative matrix factorization (NMF). These advances make topic modeling more useful across diverse fields, from social media analysis and fake news detection to scientific literature review and legal document organization, by giving more accurate thematic representations of complex textual data. Large language models are also increasingly used for topic labeling and evaluation.
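
As a rough illustration of what a traditional method such as NMF does, the sketch below factorizes a toy document-term count matrix into document-topic and topic-word weights, then lists the heaviest words per topic. Everything in it is an assumption made for this example: the four-document corpus, the helper names (`matmul`, `transpose`, `squared_error`, `nmf`, `top_words`), and the parameter defaults. It uses plain multiplicative updates in pure Python, which is one common way to fit NMF, and it is not the method of any paper listed here or the API of any library.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Squared Frobenius distance between V and its reconstruction W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2
               for row_v, row_wh in zip(v, wh)
               for x, y in zip(row_v, row_wh))

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factorize non-negative V (docs x words) into W (docs x k) and H (k x words)."""
    rng = random.Random(seed)
    n_docs, n_words = len(v), len(v[0])
    # Random positive initialization (hypothetical choice for this sketch).
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n_docs)]
    h = [[rng.random() + eps for _ in range(n_words)] for _ in range(k)]
    for _ in range(iters):
        # Multiplicative update: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[t][j] * num[t][j] / (den[t][j] + eps) for j in range(n_words)]
             for t in range(k)]
        # Multiplicative update: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][t] * num[i][t] / (den[i][t] + eps) for t in range(k)]
             for i in range(n_docs)]
    return w, h

def top_words(h, vocab, n=3):
    """Return the n highest-weighted vocabulary words for each topic row of H."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]

# Toy corpus (made up for illustration): two legal and two medical documents.
docs = [
    "court judge ruling appeal",
    "judge court verdict appeal",
    "vaccine virus infection hospital",
    "virus vaccine hospital patient",
]
vocab = sorted({word for doc in docs for word in doc.split()})
v = [[doc.split().count(word) for word in vocab] for doc in docs]

w, h = nmf(v, k=2)
for topic_id, words in enumerate(top_words(h, vocab)):
    print(topic_id, words)
```

Each row of `h` is one topic's weighting over the vocabulary, and each row of `w` says how strongly a document draws on each topic. On real corpora the count matrix would normally be TF-IDF weighted and the number of topics `k` chosen by inspecting coherence, neither of which this sketch attempts.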

Papers

October 3, 2024

Embedded Topic Models Enhanced by Wikification
Takashi Shibuya, Takehito Utsuro
Topic Modeling, Topic Model, Neural Topic Model

October 1, 2024

AutoTM 2.0: Automatic Topic Modeling Framework for Documents Analysis
Maria Khodorchenko, Nikolay Butakov, Maxim Zuev, Denis Nasonov
Topic Modeling, Topic Model, Document Analysis, Exploratory Data Analysis, New Metric

September 30, 2024

Semantic-Driven Topic Modeling Using Transformer-Based Embeddings and Clustering Algorithms
Melkamu Abay Mersha, Mesay Gemeda Yigezu, Jugal Kalita
Topic Modeling, Topic Model, Transformer Embeddings, Topic Modelling

September 28, 2024

Investigating the Impact of Text Summarization on Topic Modeling
Trishia Khandelwal
Global Impact, Text Summarization, Topic Modeling, Topic Model, Neural Topic Model

August 13, 2024

Generative AI for automatic topic labelling
Diego Kozlowski, Carolina Pradier, Pierre Benz
Generative AI, Topic Modeling, Topic Detection, Research Trend, Research Topic, Scientific Field, Topic Label

August 11, 2024

Iterative Improvement of an Additively Regularized Topic Model
Alex Gorbulev, Vasiliy Alekseev, Konstantin Vorontsov
Topic Modeling, Topic Model, Simultaneous Improvement

August 6, 2024

Topic Modeling with Fine-tuning LLMs and Bag of Sentences
Johannes Schneider
Large Language Model, Topic Modeling, Topic Model, Bag Prototype, Fine Tuned LLM, Topic Distribution, Non Pun Sentence

July 29, 2024

TopicTag: Automatic Annotation of NMF Topic Models Using Chain of Thought and Prompt Tuning with LLMs
Selma Wanna, Ryan Barron, Nick Solovyev, Maksim E. Eren, Manish Bhattarai, Kim Rasmussen, Boian S. Alexandrov
Large Language Model, Prompt Tuning, Matrix Factorization, Topic Modeling, Explicit in Document Tagging, Latent Topic, Automatic Annotation, Negative Matrix Factorization, Subject Heading

July 25, 2024

An Iterative Approach to Topic Modelling
Albert Wong, Florence Wing Yau Cheng, Ashley Keung, Yamileth Hercules, Mary Alexandra Garcia, Yew-Wei Lim, Lien Pham
Topic Modeling, Text Data, Iterative Approach, 19 Dataset

June 28, 2024

Interactive Topic Models with Optimal Transport
Garima Dhanania, Sheshera Mysore, Chau Minh Pham, Mohit Iyyer, Hamed Zamani, Andrew McCallum
Optimal Transport, Topic Modeling, Topic Model, Latent Topic, Document Comparison

June 19, 2024

Mining United Nations General Assembly Debates
Mateusz Grzyb, Mateusz Krzyziński, Bartłomiej Sobieski, Mikołaj Spytek, Bartosz Pieliński, Daniel Dan, Anna Wróblewska
Natural Language Processing, Sentiment Analysis, Topic Modeling

June 3, 2024

Towards Transparency: Exploring LLM Trainings Datasets through Visual Topic Modeling and Semantic Frame
Charles de Dampierre, Andrei Mogoutov, Nicolas Baumard
Transparency Index, Topic Modeling, Text Datasets, LLM Training, Preference Datasets, Semantic Frame

June 2, 2024

Comprehensive Evaluation of Large Language Models for Topic Modeling
Tomoki Doi, Masaru Isonuma, Hitomi Yanaka
Topic Modeling, Comprehensive Evaluation, Topic Label

Papers

Beyond Text-to-Text: An Overview of Multimodal and Generative Artificial Intelligence for Education Using Topic Modeling

Beats of Bias: Analyzing Lyrics with Topic Modeling and Gender Bias Measurements

Qualitative Insights Tool (QualIT): LLM Enhanced Topic Modeling

Unveiling the Potential of BERTopic for Multilingual Fake News Analysis -- Use Case: Covid-19

Unveiling Disparities in Maternity Care: A Topic Modelling Approach to Analysing Maternity Incident Investigation Reports

$S^3$ -- Semantic Signal Separation

LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models
