Paper ID: 2207.12406

UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu

Maaz Amjad, Grigori Sidorov, Alisa Zhila, Alexander Gelbukh, Paolo Rosso

This paper gives the overview of the first shared task at FIRE 2020 on fake news detection in the Urdu language. This is a binary classification task in which the goal is to identify fake news using a dataset composed of 900 annotated news articles for training and 400 news articles for testing. The dataset contains news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business. 42 teams from 6 different countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task. 9 teams submitted their experimental results. The participants used various machine learning methods ranging from feature-based traditional machine learning to neural network techniques. The best performing system achieved an F-score value of 0.90, showing that the BERT-based approach outperforms other machine learning classifiers.

Submitted: Jul 25, 2022

Topics

Links

arXiv PDF