Paper ID: 2305.18430
Scalable and Weakly Supervised Bank Transaction Classification
Liam Toran, Cory Van Der Walt, Alan Sammarone, Alex Keller
This paper aims to categorize bank transactions using weak supervision, natural language processing, and deep neural network techniques. Our approach minimizes the reliance on expensive and difficult-to-obtain manual annotations by leveraging heuristics and domain knowledge to train accurate transaction classifiers. We present an effective and scalable end-to-end data pipeline, including data preprocessing, transaction text embedding, anchoring, label generation, discriminative neural network training, and an overview of the system architecture. We demonstrate the effectiveness of our method by showing it outperforms existing market-leading solutions, achieves accurate categorization, and can be quickly extended to novel and composite use cases. This can in turn unlock many financial applications such as financial health reporting and credit risk assessment.
Submitted: May 28, 2023