Paper ID: 2405.16286
Generation of synthetic data using breast cancer dataset and classification with resnet18
Dilsat Berin Aytar, Semra Gunduc
Since technology is advancing so quickly in the modern era of information, data is becoming an essential resource in many fields. Correct data collection, organization, and analysis make it a potent tool for successful decision-making, process improvement, and success across a wide range of sectors. Synthetic data is required for a number of reasons, including the constraints of real data, the expense of collecting labeled data, and privacy and security problems in specific situations and domains. For a variety of reasons, including security, ethics, legal restrictions, sensitivity and privacy issues, and ethics, synthetic data is a valuable tool, particularly in the health sector. A deep learning model called GAN (Generative Adversarial Networks) has been developed with the intention of generating synthetic data. In this study, the Breast Histopathology dataset was used to generate malignant and negatively labeled synthetic patch images using MSG-GAN (Multi-Scale Gradients for Generative Adversarial Networks), a form of GAN, to aid in cancer identification. After that, the ResNet18 model was used to classify both synthetic and real data via Transfer Learning. Following the investigation, an attempt was made to ascertain whether the synthetic images behaved like the real data or if they are comparable to the original data.
Submitted: May 25, 2024