Journal of AI and Data Mining
Miri Rostami, S., Ahmadzadeh, M. (2018). Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem. Journal of AI and Data Mining, 6(2), 263-276. doi: 10.22044/jadm.2017.5061.1609

Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem

Article 3, Volume 6, Issue 2, Summer and Autumn 2018, Page 263-276
Document Type: Original Manuscript
DOI: 10.22044/jadm.2017.5061.1609
Authors
S. Miri Rostami ; M. Ahmadzadeh
Faculty of Computer and IT Engineering, Shiraz University of Technology, Shiraz, Iran.
Abstract
Data mining methods, used as a decision support system, can help predict the survival of new patients. They also give health researchers a way to investigate the relationship between risk factors and cancer survival. However, datasets associated with breast cancer survival are imbalanced, which makes it difficult to build accurate survival prognosis models. This study aims to develop a predictive model for the 5-year survivability of breast cancer patients and to discover relationships between certain predictive variables and survival. The dataset was obtained from the SEER database. First, the effectiveness of two synthetic oversampling methods, Borderline SMOTE and the Density-based Synthetic Oversampling method (DSO), is investigated for solving the class imbalance problem. Then, a combination of particle swarm optimization (PSO) and correlation-based feature selection (CFS) is used to identify the most important predictive variables. Finally, to build a predictive model, three classifiers, a decision tree (C4.5), a Bayesian network, and logistic regression, are applied to the cleaned dataset. Accuracy, sensitivity, specificity, and G-mean are used to evaluate the performance of the proposed hybrid approach, and the area under the ROC curve (AUC) is used to evaluate the performance of the feature selection method. Results show that, among all combinations, DSO + PSO_CFS + C4.5 performs best in terms of accuracy, sensitivity, G-mean, and AUC, with values of 94.33%, 0.930, 0.939, and 0.939, respectively.
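The assessment metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions (sensitivity as the true positive rate, specificity as the true negative rate, G-mean as the geometric mean of the two). The function name `imbalance_metrics`, the confusion-matrix counts, and the choice of which class counts as positive are hypothetical and are not taken from the paper.

```python
import math

def imbalance_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity, and G-mean
    from binary confusion-matrix counts (hypothetical helper)."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }

# Hypothetical counts for an imbalanced test set (100 positive, 900 negative).
balanced_model = imbalance_metrics(tp=90, fn=10, tn=800, fp=100)

# A degenerate model that always predicts the majority (negative) class.
majority_only = imbalance_metrics(tp=0, fn=100, tn=900, fp=0)

print(balanced_model)
print(majority_only)
```

In this made-up example the majority-only model has higher accuracy than the other model but a G-mean of zero, which shows why accuracy alone can be misleading on imbalanced data and why G-mean is reported alongside it.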
Keywords
breast cancer; survival; class imbalance problem; oversampling technique; feature selection
Main Subjects
F.4.17. Survival analysis
Statistics
JAD is licensed under a Creative Commons Attribution 4.0 International License.