Paper ID: 2409.04860

Sequential Classification of Misinformation

Daniel Toma, Wasim Huleihel

In recent years there have been a growing interest in online auditing of information flow over social networks with the goal of monitoring undesirable effects, such as, misinformation and fake news. Most previous work on the subject, focus on the binary classification problem of classifying information as fake or genuine. Nonetheless, in many practical scenarios, the multi-class/label setting is of particular importance. For example, it could be the case that a social media platform may want to distinguish between ``true", ``partly-true", and ``false" information. Accordingly, in this paper, we consider the problem of online multiclass classification of information flow. To that end, driven by empirical studies on information flow over real-world social media networks, we propose a probabilistic information flow model over graphs. Then, the learning task is to detect the label of the information flow, with the goal of minimizing a combination of the classification error and the detection time. For this problem, we propose two detection algorithms; the first is based on the well-known multiple sequential probability ratio test, while the second is a novel graph neural network based sequential decision algorithm. For both algorithms, we prove several strong statistical guarantees. We also construct a data driven algorithm for learning the proposed probabilistic model. Finally, we test our algorithms over two real-world datasets, and show that they outperform other state-of-the-art misinformation detection algorithms, in terms of detection time and classification error.

Submitted: Sep 7, 2024