ADABOOST Ensemble Algorithms for Breast Cancer Classification
Date
2019
Publisher
Journal of Advances in Computer Research
Abstract
With advances in technologies, different tumor features have been collected for
Breast Cancer (BC) diagnosis. Handling such large data sets poses challenges,
including high storage requirements and the time needed to access and process
the data. The objective of this paper is to classify BC based on the
extracted tumor features and to develop an ADABOOST ensemble model to extract
useful information and diagnose the tumor. In this research work, both
homogeneous and heterogeneous ensemble classifiers (combining two different
classifiers together) were implemented, and the Synthetic Minority Over-Sampling
Technique (SMOTE), a data mining pre-processing step, was used to deal with the class
imbalance problem and noise in the dataset. The proposed method
involves two steps. The first step employs SMOTE to reduce the effect of class
imbalance in the dataset. The second step classifies the data using decision tree
algorithms (ADTree, CART, REPTree and Random Forest), Naïve Bayes and their
ensembles. The experiments were run in the WEKA Explorer (Weka 3.6).
Experimental results show that ADABOOST-Random Forest outperforms the
other classification algorithms with 82.52% accuracy, followed by Random Forest-CART
with 72.73% accuracy, while Naïve Bayes has the lowest accuracy at
35.70%.
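
The two-step procedure described in the abstract (oversample the minority class with SMOTE, then train a boosted ensemble) can be illustrated with a minimal pure-Python sketch. This is not the paper's WEKA setup: it swaps in one-feature threshold rules (decision stumps) as the base learner instead of ADTree, CART, REPTree or Random Forest, and it runs on made-up 2-D toy points rather than the breast cancer data. The names `smote`, `fit_stump`, `adaboost` and `predict` are hypothetical and chosen only for this illustration.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def stump_predict(stump, x):
    """Apply a threshold rule: `sign` if x[j] <= t, otherwise -sign."""
    _, j, t, sign = stump
    return sign if x[j] <= t else -sign


def fit_stump(X, y, w):
    """Exhaustively pick the (feature, threshold, sign) rule with the
    lowest weighted error. Returns (error, feature, threshold, sign)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                candidate = (0.0, j, t, sign)
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump_predict(candidate, xi) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best


def adaboost(X, y, rounds=5):
    """Discrete AdaBoost with decision stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # vote of this stump
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in model)
    return 1 if score >= 0 else -1


# Made-up toy data (not from the paper): 8 majority vs 4 minority points
majority = [[1.0, 1.0], [1.5, 0.5], [2.0, 1.5], [0.5, 2.0],
            [1.0, 2.5], [2.5, 1.0], [2.0, 2.0], [1.5, 1.5]]
minority = [[4.0, 4.0], [4.5, 3.5], [5.0, 4.5], [4.0, 5.0]]

# Step 1: balance the classes with SMOTE
synthetic = smote(minority, n_new=len(majority) - len(minority))
X = majority + minority + synthetic
y = [-1] * len(majority) + [1] * (len(minority) + len(synthetic))

# Step 2: train the boosted ensemble on the balanced data
model = adaboost(X, y, rounds=5)
train_accuracy = sum(predict(model, xi) == yi for xi, yi in zip(X, y)) / len(X)
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority class already occupies. A practical pipeline would also hold out test data before oversampling so that synthetic points do not leak into the evaluation.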
Keywords
Breast Cancer, ADABOOST, Synthetic Minority Over-Sampling Technique, Random Forest, Ensemble