Paper ID: 2403.12009
Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks
K. P. Santoso, R. V. H. Ginardi, R. A. Sastrowardoyo, F. A. Madany
In skin lesion image classification, intricate spatial and semantic features pose significant challenges for conventional Convolutional Neural Network (CNN)-based methodologies. These challenges are compounded by the imbalanced nature of skin lesion datasets, which hampers the ability of models to learn minority-class features effectively. Previous attempts, even those employing augmentation strategies such as Generative Adversarial Networks (GANs), have not fully addressed these complexities. This study introduces an approach that integrates Graph Neural Networks (GNNs) with Capsule Networks to enhance classification performance. GNNs, known for their proficiency in handling graph-structured data, offer an advanced mechanism for capturing complex patterns and relationships beyond the capabilities of traditional CNNs. Capsule Networks further contribute superior recognition of spatial hierarchies within images. Our research focuses on evaluating and enhancing the Tiny Pyramid Vision GNN (Tiny Pyramid ViG) architecture by combining it with a Capsule Network. This hybrid model was applied to the MNIST:HAM10000 dataset, a comprehensive skin lesion dataset designed for benchmarking classification models. After 75 epochs of training, our model achieved accuracies of 89.23% and 95.52%, surpassing established benchmarks such as GoogLeNet (83.94%), InceptionV3 (86.82%), MobileNet V3 (89.87%), EfficientNet-B7 (92.07%), ResNet18 (92.22%), ResNet34 (91.90%), ViT-Base (73.70%), and IRv2-SA (93.47%) on the same dataset. This outcome underscores the potential of our approach in overcoming the inherent challenges of skin lesion classification, contributing to the advancement of image-based diagnosis in dermatology.
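To make the hybrid design concrete, the following is a minimal PyTorch sketch of the general pattern the abstract describes: patch features processed by a graph-based extractor and then routed into per-class capsules. It is not the authors' implementation; the full Tiny Pyramid ViG backbone is replaced here by a single simplified max-relative graph convolution, and all names (`SimpleGrapher`, `CapsuleHead`, `ViGCapsNet`), dimensions, and hyperparameters are illustrative assumptions.

```python
# Sketch only: toy graph feature extractor + capsule head with dynamic routing.
# Assumes HAM10000's 7 lesion classes; everything else is illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    """Capsule squashing non-linearity: preserves direction, bounds length in [0, 1)."""
    norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)


class SimpleGrapher(nn.Module):
    """Max-relative style graph convolution over patch tokens using a k-NN graph."""
    def __init__(self, dim, k=9):
        super().__init__()
        self.k = k
        self.fc = nn.Linear(2 * dim, dim)

    def forward(self, x):                                   # x: (B, N, C)
        dist = torch.cdist(x, x)                            # pairwise distances (B, N, N)
        idx = dist.topk(self.k, largest=False).indices      # k nearest neighbors (B, N, k)
        neighbors = torch.gather(
            x.unsqueeze(1).expand(-1, x.size(1), -1, -1),   # (B, N, N, C)
            2, idx.unsqueeze(-1).expand(-1, -1, -1, x.size(-1)))
        agg = (neighbors - x.unsqueeze(2)).max(dim=2).values  # max-relative aggregation
        return x + self.fc(torch.cat([x, agg], dim=-1))     # residual token update


class CapsuleHead(nn.Module):
    """Routes token features into one capsule per class; prediction = capsule length."""
    def __init__(self, in_caps, in_dim, num_classes=7, out_dim=16, iters=3):
        super().__init__()
        self.iters = iters
        self.W = nn.Parameter(0.01 * torch.randn(in_caps, num_classes, out_dim, in_dim))

    def forward(self, u):                                   # u: (B, in_caps, in_dim)
        u_hat = torch.einsum('ijkl,bil->bijk', self.W, u)   # prediction vectors
        b = torch.zeros(u.size(0), u_hat.size(1), u_hat.size(2), device=u.device)
        for _ in range(self.iters):                         # dynamic routing by agreement
            c = F.softmax(b, dim=2)                         # routing coefficients over classes
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))  # (B, num_classes, out_dim)
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)
        return v.norm(dim=-1)                               # class scores in [0, 1)


class ViGCapsNet(nn.Module):
    """Patch embedding -> graph feature extractor -> capsule classification head."""
    def __init__(self, img_size=224, patch=16, dim=64, num_classes=7):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.grapher = SimpleGrapher(dim)
        n_tokens = (img_size // patch) ** 2
        self.head = CapsuleHead(in_caps=n_tokens, in_dim=dim, num_classes=num_classes)

    def forward(self, img):                                 # img: (B, 3, H, W)
        x = self.embed(img).flatten(2).transpose(1, 2)      # (B, N, C) patch tokens
        x = self.grapher(x)
        return self.head(x)                                 # per-class capsule lengths


if __name__ == "__main__":
    model = ViGCapsNet()
    scores = model(torch.randn(2, 3, 224, 224))
    print(scores.shape)                                     # torch.Size([2, 7])
```

The intent of the pairing, as the abstract argues, is that the graph stage captures relational structure among image regions while the capsule stage preserves spatial hierarchy, with the length of each output capsule serving as the class confidence.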
Submitted: Mar 18, 2024