Paper ID: 2309.14397

Predicting environment effects on breast cancer by implementing machine learning

Muhammad Shoaib Farooq, Mehreen Ilyas

Breast cancer is increasingly a major cause of death among women, overtaking heart disease. While genetic factors are important in the development of breast cancer, recent research indicates that environmental factors also play a substantial role in its occurrence and progression. This study reviews the literature on the environmental factors that may affect breast cancer risk, incidence, and outcomes. It first examines how lifestyle choices, such as diet, exercise, and alcohol consumption, may affect hormonal imbalance and inflammation, two important drivers of breast cancer development. It then explores the role of environmental contaminants such as pesticides, endocrine-disrupting chemicals (EDCs), and industrial emissions, all of which have been linked to a higher risk of breast cancer through interference with hormone signaling and DNA damage. Five machine learning algorithms are used for prediction: Logistic Regression, Random Forest, K-Nearest Neighbors (KNN), Support Vector Classifier (SVC), and Extra Trees classifier. The models were evaluated using the confusion matrix, correlation coefficient, F1-score, precision, recall, and ROC curve. Random Forest achieved the best accuracy among all classifiers (0.91), and Logistic Regression reached an ROC value of 0.901. The accuracy of the machine learning algorithms used in this research was good, which indicates that these techniques could serve as alternative prediction techniques in breast cancer survival analysis, particularly in Asia.
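
As an illustration of the evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, F1-score, and a correlation coefficient from the confusion matrix of a binary classifier. It is not the authors' code. The function names `confusion_counts` and `binary_metrics` are hypothetical, the labels are made-up toy data rather than results from the study, and reading "correlation coefficient" as the Matthews correlation coefficient (MCC) is an assumption.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred):
    """Compute common binary classification metrics from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn

    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Matthews correlation coefficient (assumed meaning of "correlation coefficient").
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Toy labels for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In practice, each of the five classifiers would produce its own `y_pred` on a held-out test set, and the resulting metric dictionaries could be compared side by side. The ROC curve is omitted here because it requires predicted scores or probabilities rather than hard labels.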

Submitted: Sep 25, 2023