Paper ID: 2403.01318

High-Dimensional Tail Index Regression: with An Application to Text Analyses of Viral Posts in Social Media

Yuya Sasaki, Jing Tao, Yulong Wang

Motivated by the empirical observation of power-law distributions in the credits (e.g., "likes") of viral social media posts, we introduce a high-dimensional tail index regression model and propose methods for estimation and inference of its parameters. First, we present a regularized estimator, establish its consistency, and derive its convergence rate. Second, we introduce a debiasing technique for the regularized estimator to facilitate inference and prove its asymptotic normality. Third, we extend our approach to handle large-scale online streaming data using stochastic gradient descent. Simulation studies corroborate our theoretical findings. We apply these methods to the text analysis of viral posts on X (formerly Twitter) related to LGBTQ+ topics.
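
As a rough illustration of the regularized estimation step, the sketch below assumes a Pareto-type tail model in which, above a threshold w, the conditional tail index is exp(x'θ), and it fits a sparse θ by L1-penalized maximum likelihood using proximal gradient descent with backtracking. The model form, the penalty level `lam`, the simulated data, and all function names are assumptions made for illustration; none of them are taken from the paper, and the debiasing and streaming steps are not covered.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def neg_loglik(theta, X, z):
    # Average negative log-likelihood (up to constants) under the assumed model:
    # z = log(Y / w) is exponential with rate exp(x' theta) for exceedances Y > w.
    eta = X @ theta
    with np.errstate(over="ignore"):
        return np.mean(np.exp(eta) * z - eta)

def grad(theta, X, z):
    # Gradient of neg_loglik with respect to theta.
    eta = X @ theta
    return X.T @ (np.exp(eta) * z - 1.0) / len(z)

def fit_l1_tail_index(X, z, lam, n_iter=300, step=1.0):
    # Proximal gradient descent with a backtracking line search on the step size.
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        f, g = neg_loglik(theta, X, z), grad(theta, X, z)
        t = step
        while True:
            cand = soft_threshold(theta - t * g, t * lam)
            d = cand - theta
            # Accept once the quadratic upper bound holds at the candidate.
            if neg_loglik(cand, X, z) <= f + g @ d + (d @ d) / (2 * t) + 1e-12:
                break
            t *= 0.5
        theta = cand
    return theta

# Simulated example (hypothetical design): 50 covariates, only two are active.
rng = np.random.default_rng(0)
n, p = 2000, 50
X = rng.normal(size=(n, p))
theta0 = np.zeros(p)
theta0[0], theta0[1] = 0.5, -0.5
alpha = np.exp(X @ theta0)               # conditional tail index per observation
z = rng.exponential(scale=1.0 / alpha)   # log-exceedances log(Y / w)

theta_hat = fit_l1_tail_index(X, z, lam=0.05)
print("nonzero coefficients:", np.count_nonzero(theta_hat))
print("first two estimates:", theta_hat[:2])
```

In this sketch every simulated observation is treated as an exceedance; with real data one would first keep only the posts whose credits exceed the chosen threshold w and set z = log(Y / w). The penalty level is fixed by hand here, whereas in practice it would be tuned.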

Submitted: Mar 2, 2024