ADABOOST Ensemble Algorithms for Breast Cancer Classification

Date

2019

Publisher

Journal of Advances in Computer Research

Abstract

Advances in technology have made it possible to collect many different tumor features for Breast Cancer (BC) diagnosis. Working with such large data sets poses challenges, including high storage requirements and long access and processing times. The objective of this paper is to classify BC based on the extracted tumor features and to develop an ADABOOST ensemble model that extracts useful information and diagnoses the tumor. Both homogeneous and heterogeneous ensemble classifiers (combining two different classifiers) were implemented, and the Synthetic Minority Over-Sampling Technique (SMOTE), a data mining pre-processing method, was used to deal with class imbalance and noise in the dataset. The proposed method involves two steps. The first step applies SMOTE to reduce the effect of class imbalance in the dataset. The second step performs classification using decision tree algorithms (ADTree, CART, REPTree and Random Forest), Naïve Bayes, and their ensembles. The experiments were run in the WEKA Explorer (Weka 3.6). Experimental results show that ADABOOST-Random Forest outperforms the other classification algorithms with 82.52% accuracy, followed by Random Forest-CART with 72.73% accuracy, while Naïve Bayes performs worst with 35.70% accuracy.
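
The two-step method in the abstract (oversample the minority class, then boost a base classifier) can be illustrated with a minimal Python sketch. This is not the authors' WEKA 3.6 setup: the data points are made up for illustration, the helper names (`smote`, `fit_stump`, `adaboost`, `predict`) are hypothetical, and one-feature decision stumps stand in for the tree-based learners named in the abstract. It only shows the general shape of SMOTE-style interpolation followed by AdaBoost reweighting.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def stump_predict(stump, x):
    """A stump votes `sign` when the chosen feature exceeds the threshold."""
    feature, threshold, sign = stump
    return sign if x[feature] > threshold else -sign

def fit_stump(X, y, w):
    """Return the stump with the lowest weighted error, and that error."""
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for sign in (1, -1):
                stump = (f, t, sign)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def adaboost(X, y, rounds=5):
    """Discrete AdaBoost with labels in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in model)
    return 1 if score >= 0 else -1

# Made-up toy data: 8 majority points (label -1), 3 minority points (label +1).
majority = [[1.0, 1.2], [1.5, 0.8], [2.0, 1.9], [0.7, 2.2],
            [1.8, 1.1], [2.3, 0.6], [1.1, 1.7], [0.9, 0.5]]
minority = [[4.0, 4.1], [4.6, 3.7], [3.9, 4.8]]

# Step 1: balance the classes with synthetic minority samples.
synthetic = smote(minority, n_new=len(majority) - len(minority))
X = majority + minority + synthetic
y = [-1] * len(majority) + [1] * (len(minority) + len(synthetic))

# Step 2: train the boosted ensemble on the balanced data.
model = adaboost(X, y)
print(predict(model, [4.2, 4.0]), predict(model, [1.0, 1.0]))
```

In a real experiment the oversampling would be applied to the training split only, so that synthetic points never leak into the data used to measure accuracy.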

Keywords

Breast Cancer, ADABOOST, Synthetic Minority Over-Sampling Technique, Random Forest, Ensemble
