Paper ID: 2411.06772

A Text Classification Model Combining Adversarial Training with Pre-trained Language Model and neural networks: A Case Study on Telecom Fraud Incident Texts

Liu Zhuoxian, Shi Tuo, Hu Xiaofeng

Front-line police officers typically sort telecom fraud cases reported through police calls into 14 subcategories to support targeted prevention measures, such as precisely aimed public education. However, the associated data is large in volume, diverse in information content, and varied in expression. Manual classification is precise but relatively inefficient, and efficient, accurate intelligent models that could replace it are currently lacking. To address these challenges, this paper proposes a text classification model that combines adversarial training with a pre-trained language model and neural networks. The Linguistically-motivated Pre-trained Language Model extracts three types of language features, and the Fast Gradient Method algorithm then perturbs the resulting embedding layer. Subsequently, a Bi-directional Long Short-Term Memory network and a Convolutional Neural Network extract contextual syntactic information and local semantic information, respectively. The model achieved a classification accuracy of 83.9% when trained on a portion of the telecom fraud case data provided by the operational department. The model has been deployed in the operational department, freeing up a significant amount of manpower and improving the department's efficiency in combating telecom fraud crimes. Furthermore, because the model is general in design, other application scenarios await further exploration.
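
The perturbation step named in the abstract can be illustrated with a minimal sketch of the Fast Gradient Method in its commonly used form, r = epsilon * g / ||g||_2, where g is the gradient of the loss with respect to the embedding. The function names, the example vectors, and the value of epsilon below are assumptions made for illustration; the abstract does not state the authors' settings or implementation.

```python
import math


def fgm_perturbation(grad, epsilon=1.0):
    """Return r = epsilon * g / ||g||_2 for a flat gradient vector g.

    Hypothetical helper: illustrates the usual FGM formula only.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        # A zero gradient gives no direction, so apply no perturbation.
        return [0.0 for _ in grad]
    return [epsilon * g / norm for g in grad]


def perturb_embedding(embedding, grad, epsilon=1.0):
    """Add the FGM perturbation to an embedding vector (illustrative)."""
    r = fgm_perturbation(grad, epsilon)
    return [e + d for e, d in zip(embedding, r)]


# Toy values, not taken from the paper.
embedding = [0.5, -1.0, 2.0]
grad = [3.0, 0.0, 4.0]  # assumed gradient of the loss w.r.t. the embedding
adversarial = perturb_embedding(embedding, grad, epsilon=0.5)
```

In adversarial training of this kind, the loss is typically computed on both the clean and the perturbed embeddings, and the perturbation is discarded after each update step.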

Submitted: Nov 11, 2024