Oladosu Oladimeji; Olayanju Oladimeji
Abstract
Breast cancer is the second major cause of death and accounts for 16% of all cancer deaths worldwide. Most methods of detecting breast cancer, such as mammography, are expensive and difficult to interpret. They also have limitations, including cumulative radiation exposure, over-diagnosis, and false positives and negatives in women with dense breasts, which create uncertainty in high-risk populations. The objective of this study is to detect breast cancer from blood analysis data using classification algorithms, as a complement to these expensive methods. High-ranking features were extracted from the dataset. The k-nearest neighbours (KNN), support vector machine (SVM) and J48 decision tree algorithms were trained to classify 116 instances. Both 10-fold cross-validation and holdout procedures were used, each repeated with different random seeds. The results showed that KNN achieved the highest accuracy: 89.99% under cross-validation and 85.21% under holdout. J48 followed with 84.65% and 75.65% for the two procedures respectively, and SVM obtained 77.58% and 68.69%. Blood glucose level was found to be a major determinant in detecting breast cancer; however, because it is also affected by other health conditions such as diabetes, it has to be combined with other attributes to reach a decision. Based on the model generated in this study, women are advised to have regular check-ups that include blood analysis, in order to know which blood components need attention to help prevent breast cancer.
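The evaluation protocol described above (10-fold cross-validation and a holdout split, each repeated with different random seeds) can be illustrated with a minimal sketch. This is not the authors' implementation: the data below is synthetic and is not the study's dataset, only the KNN classifier is shown, and the value k=3, the 66/34 holdout split, the seed values and all function names are assumptions chosen for illustration.

```python
import random
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points nearest to x (Euclidean)."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def accuracy(train_idx, test_idx, X, y, k=3):
    """Fraction of test instances that KNN labels correctly."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)


def kfold_indices(n, folds, seed):
    """Shuffle 0..n-1 with the given seed and deal them into disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::folds] for f in range(folds)]


def cross_val_accuracy(X, y, folds=10, seed=0, k=3):
    """Mean accuracy over the folds; each fold is the test set exactly once."""
    parts = kfold_indices(len(X), folds, seed)
    scores = []
    for f, test_idx in enumerate(parts):
        train_idx = [i for g, part in enumerate(parts) if g != f for i in part]
        scores.append(accuracy(train_idx, test_idx, X, y, k))
    return sum(scores) / len(scores)


def holdout_accuracy(X, y, test_fraction=0.34, seed=0, k=3):
    """Single seeded train/test split (the 66/34 ratio is an assumption)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * (1 - test_fraction))
    return accuracy(idx[:cut], idx[cut:], X, y, k)


def make_toy_data(n=116, seed=42):
    """Synthetic two-feature data (NOT the study's data): two separated clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 10.0
        X.append((centre + rng.uniform(-1, 1), centre + rng.uniform(-1, 1)))
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    seeds = range(1, 6)  # hypothetical seed values
    cv = [cross_val_accuracy(X, y, seed=s) for s in seeds]
    ho = [holdout_accuracy(X, y, seed=s) for s in seeds]
    print("mean 10-fold CV accuracy:", sum(cv) / len(cv))
    print("mean holdout accuracy:", sum(ho) / len(ho))
```

Averaging over several seeds reduces the dependence of the reported accuracy on one particular shuffle, which matters for a dataset of only 116 instances. The toy clusters are deliberately easy to separate, so the accuracies this sketch prints say nothing about the figures reported in the study.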