Paper ID: 2410.18879

Multi-Class Abnormality Classification in Video Capsule Endoscopy Using Deep Learning

Arnav Samal, Ranya

This report outlines Team Seq2Cure's deep learning approach for the Capsule Vision 2024 Challenge, leveraging an ensemble of convolutional neural networks (CNNs) and transformer-based architectures for multi-class abnormality classification in video capsule endoscopy frames. The dataset comprised over 50,000 frames from three public sources and one private dataset, labeled across 10 abnormality classes. To overcome the limitations of traditional CNNs in capturing global context, we integrated CNN and transformer models within a multi-model ensemble. Our approach achieved a balanced accuracy of 86.34 percent and a mean AUC-ROC score of 0.9908 on the validation set, with significant improvements in classifying complex abnormalities. Code is available at this http URL.
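
To make the ensemble and the balanced-accuracy metric concrete, the following is a minimal sketch, not the team's implementation. It assumes the ensemble combines models by averaging their per-class probabilities (soft voting), which the abstract does not confirm. The function names, the two toy "models", and the three-class toy data are all hypothetical. Balanced accuracy is computed here as the mean of per-class recall, which is the standard definition.

```python
from typing import List, Sequence


def average_probabilities(
    model_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Soft voting (an assumption, not confirmed by the abstract).

    model_probs[m][i][c] is the probability that model m assigns
    to class c for frame i. Returns the per-frame class average.
    """
    n_models = len(model_probs)
    n_frames = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [
            sum(model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        for i in range(n_frames)
    ]


def argmax_predictions(probs: Sequence[Sequence[float]]) -> List[int]:
    """Pick the highest-probability class index for each frame."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy data: 2 models, 4 frames, 3 classes (the challenge has 10).
cnn_probs = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
vit_probs = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7], [0.2, 0.7, 0.1]]
y_true = [0, 1, 2, 0]

ensemble_probs = average_probabilities([cnn_probs, vit_probs])
preds = argmax_predictions(ensemble_probs)
score = balanced_accuracy(y_true, preds)
print(preds, round(score, 4))
```

Because balanced accuracy weights every class equally, a rare abnormality class counts as much as a common one, which is why it is a natural headline metric for an imbalanced 10-class dataset.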

Submitted: Oct 24, 2024

Topics

Deep Learning
Convolutional Neural Network
Deep Learning Approach
Capsule Endoscopy
Traditional CNNs
Disease Detection Model
Video Capsule Endoscopy
