Efficient Support Vector Machine Classification of Diffuse Large B-Cell Lymphoma and Follicular Lymphoma mRNA Tissue Samples
Date
2015
Publisher
Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania.
Abstract
This study presents a Support Vector Machine (SVM) algorithm that incorporates a feature selection procedure for identifying gene biomarkers predictive of Diffuse Large B-Cell Lymphoma (DLBCL) and Follicular Lymphoma (FL) tumor samples. The data were published real-life microarray cancer data containing 7,129 gene expression profiles measured on 77 biological samples, comprising 58 DLBCL and 19 FL tissue samples. The Welch statistic was used for dimension reduction at the feature selection phase of the SVM algorithm. The cost and kernel parameters of the SVM model were tuned by 10-fold cross-validation to improve the efficiency of the SVM classifier. The entire sample was randomly partitioned into 95% training and 5% test samples. The SVM classifier was trained using a Monte Carlo cross-validation approach with 1000 replications. The performance of the classifier was assessed on the test samples using the misclassification error rate (MER) and other performance measures. The results showed that the SVM classifier is efficient, yielding very high prediction accuracy on the tumor samples with fewer differentially expressed genes. The selected gene biomarkers can be subjected to further clinical screening to determine their biological relationship with the DLBCL and FL tumor subgroups. However, further studies with larger samples might be needed to validate the results of this work.
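The feature selection and Monte Carlo cross-validation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`welch_statistic`, `top_genes`, `monte_carlo_mer`), the number of retained genes `k`, and the choice to rank genes on the training split of each replication are all hypothetical. The SVM itself is not implemented here; `fit` and `predict` are placeholders for whatever classifier is plugged in.

```python
import math
import random
from statistics import mean, variance


def welch_statistic(a, b):
    """Welch's t statistic: difference in means over the unpooled standard error."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def top_genes(X, y, k):
    """Indices of the k genes with the largest |Welch statistic|.

    X is a list of samples (rows of expression values); y holds labels 0 or 1.
    Each class needs at least two samples for the variance to be defined.
    """
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        scores.append((abs(welch_statistic(a, b)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def monte_carlo_mer(X, y, fit, predict, k, test_frac=0.05, reps=1000, seed=0):
    """Mean misclassification error rate (MER) over repeated random splits."""
    rng = random.Random(seed)
    n = len(X)
    n_test = max(1, round(test_frac * n))
    errors = []
    for _ in range(reps):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        X_train = [X[i] for i in train]
        y_train = [y[i] for i in train]
        # Assumption: genes are ranked on the training split only,
        # so the held-out samples do not influence feature selection.
        genes = top_genes(X_train, y_train, k)
        model = fit([[row[j] for j in genes] for row in X_train], y_train)
        wrong = sum(
            predict(model, [X[i][j] for j in genes]) != y[i] for i in test
        )
        errors.append(wrong / n_test)
    return mean(errors)
```

If scikit-learn is available, `fit` could wrap an `sklearn.svm.SVC` whose cost and kernel parameters are chosen with `GridSearchCV(..., cv=10)`, which would correspond to the 10-fold tuning step mentioned above; that wiring is left out here to keep the sketch dependency-free.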
Keywords
SVM, Diffuse Large B-Cell Lymphoma, Follicular Lymphoma, 10-fold cross-validation, Welch Statistic