Browsing by Author "Dauda, K. A."
Efficient Support Vector Machine Classification of Diffuse Large B-Cell Lymphoma and Follicular Lymphoma mRNA Tissue Samples
(Annals Computer Science Series, 2015-04-25) Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Dauda, K. A.; Olorede, K. O.
This paper proposes a weighted Support Vector Machine (w-SVM) method for efficient class prediction in binary response data sets. The proposed method was obtained by introducing weights, based on the point-biserial correlation between each predictor and the dichotomized response variable, into the standard SVM algorithm to maximize classification accuracy. The optimal values of the w-SVM cost and of the parameters of each kernel were determined by grid search with 10-fold cross-validation. Monte-Carlo cross-validation was employed to examine the predictive power of the proposed method by partitioning the data into training and test samples using different splitting ratios. Application of the proposed method to the simulated data sets yielded high prediction accuracy on the test sample. Results from other performance indices lent further credence to the efficiency of the proposed method. The performance of the proposed method was compared with three state-of-the-art machine learning methods, including the standard SVM, and the results showed the superiority of the proposed method over the others. Overall, the modified algorithm with the Radial Basis Function (RBF) kernel performed excellently and achieved better predictive performance than any of the existing classifiers considered.

Efficient Support Vector Machine Classification of Diffuse Large B-Cell Lymphoma and Follicular Lymphoma mRNA Tissue Samples
(Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania, 2015) Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Dauda, K. A.; Olorede, K. O.
In this study, an efficient Support Vector Machine (SVM) algorithm is presented that incorporates a feature selection procedure for identifying and selecting gene biomarkers predictive of Diffuse Large B-Cell Lymphoma (DLBCL) and Follicular Lymphoma (FL) tumor samples. The data employed were published real-life microarray cancer data containing 7,129 gene expression profiles measured on 77 biological samples, comprising 58 DLBCL and 19 FL tissue samples. The dimension reduction approach of the Welch statistic was employed at the feature selection phase of the SVM algorithm. The cost and kernel parameters of the SVM model were tuned by 10-fold cross-validation to improve the efficiency of the SVM classifier. The entire sample was randomly partitioned into 95% training and 5% test samples. The SVM classifier was trained using a Monte-Carlo cross-validation approach with 1000 replications. The performance of this classifier was assessed on the test samples using the misclassification error rate (MER) and other performance measures. The results showed that the SVM classifier is quite efficient, yielding very high prediction accuracy for the tumor samples with fewer differentially expressed genes. The gene biomarkers selected in this work can be subjected to further clinical screening to determine their biological relationship with the DLBCL and FL tumor subgroups. However, more studies with larger samples might be needed in future to validate the results from this work.

Multiclass Feature Selection and Classification with Support Vector Machine in Genomic Study
(Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017) Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Amusa, L. B.; Gatta, N. F.; Dauda, K. A.; Olorede, K. O.
This study proposes an efficient Support Vector Machine (SVM) algorithm for feature selection and classification of multiclass response groups in high-dimensional (microarray) data. The feature selection stage of the algorithm employed the F-statistic of an ANOVA-like testing scheme at a chosen family-wise error rate (FWER) to control the detection of false positive features. The hyper-parameters of the SVM were tuned by 10-fold cross-validation to determine the appropriate kernel, using the one-versus-all approach. The entire simulated dataset was randomly partitioned into 95% training and 5% test sets, with the SVM classifier built on the training sets and its prediction accuracy on the response class assessed on the test sets over 1000 Monte-Carlo cross-validation (MCCV) runs. The classification results of the proposed classifier were assessed using the Misclassification Error Rates (MERs) and other performance indices. Results from the Monte-Carlo study showed that the proposed SVM classifier was quite efficient, yielding high prediction accuracy for the response groups with fewer differentially expressed features than when all the features were employed for classification. The performance of this new method on some published cancer data sets shall be examined vis-à-vis other state-of-the-art machine learning methods in future work.

On the Strength of Agreement Between Students’ Initial and Final Academic Performances in Nigeria University System
(ABACUS, Mathematical Association of Nigeria, Nigeria, 2018) Banjoko, A. W.; Yahya, W. B.; Abiodun, H. S.; Adeleke, M. O.; Afolayan, R. B.; Garba, M. K.; Olorede, K. O.; Dauda, K. A.
This paper examines the strength of agreement between the academic performances of students after their first and final years in the University. The academic performances of a total of 886 students who were admitted into various academic programs in the Faculty of Science, University of Ilorin, during the 2008/2009 academic session were followed up to their year of graduation in 2012.
Information collected included, among others, the students’ grade point average (GPA) at the end of their first year in 2008 and their final cumulative grade point average (CGPA) at the end of their studies in 2012. Results from this study generally showed a fair agreement between students’ initial and final academic performances in the Nigerian university system (p < 0.001). It was also found that about 50% of students maintained the classes of degrees they had in their first year until graduation, about 40% improved on their performances, and the performances of about 7% dropped from what they had during their first year. Further results showed that students’ performance is gender sensitive. Specifically, about 45% of female and 60% of male students maintained the classes of degrees they had during their first year in the University, about 50% and 30%, respectively, improved on theirs, and about 5% and 10%, respectively, dropped from their initial academic performances by the end of their studies. Finally, students in the Biological Sciences improved on their initial academic performances more than their counterparts in the Physical Sciences. Also, female students improved on their initial academic performances more than their male counterparts. This work will serve as a useful counselling guide to prospective university admission seekers and to all stakeholders in enhancing students’ academic performances in the University system.

Partial Least Squares-Based Classification and Selection of Predictive Variables of Crimes against Properties in Nigeria
(Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017) Olorede, K. O.; Yahya, W. B.; Garuba, A. O.; Banjoko, A. W.; Dauda, K. A.
In this study, the state-of-the-art Partial Least Squares (PLS) based models (PLS-Discriminant Analysis (PLS-DA), Sparse PLS-DA (SPLS-DA) and Sparse Generalized PLS (SGPLS)) were employed to model and classify the rate of crimes (low or high) committed against properties across the 36 states in Nigeria and the Federal Capital Territory (FCT). The core variables that are predictive of this crime type in Nigeria were identified using the LASSO penalty method via the PLS. Data on occurrences of offences against property, obtained from the database of the Nigerian Police Force, were utilized in this study. The missing values due to non-occurrence or non-reportage of crime cases were imputed using multivariate imputation by chained equations. The complete data set was partitioned into training and test sets using an 80:20 holdout scheme. The 80% training set was used to build the PLS-based models, which were in turn used to predict the overall crime rates of Nigerian cities in the 20% held-out test data over 200 Monte-Carlo cross-validation runs. All the PLS-based models yielded good classification of unseen test samples into either of the two qualitative classes of high and low crime rates, with an average Correct Classification Rate (CCR) of 94%. Other performance metrics, including sensitivity, specificity, positive and negative predictive values, balanced accuracy and diagnostic odds ratio, were estimated to further examine their classification efficiencies. The SGPLS, which had the highest CCR, identified fewer (just 3 out of 12) core crime variables predictive of the overall crime rates in Nigerian states than the SPLS, which selected 9 such variables to achieve about the same feat.

Performance Evaluation of Some Estimators of Linear Models with Collinearity and Non–Gaussian Error
(Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017) Yahya, W. B.; Garba, M. K.; Ajayi, A. G.; Dauda, K. A.; Olaniran, O. R.; Gatta, N. F.
Among the typical challenges in multiple linear regression models are multicollinearity and non-normal disturbances, which have undesirable consequences for the ordinary least squares (OLS) estimator, the popular and naïve technique for estimating linear models. It is therefore critical to combine strategies for estimating regression models in order to cope when these challenges are present. In this study, the strengths of some methods of estimating the classical linear regression model in the presence of multicollinearity and non-normal error structures were investigated. The conventional Least Squares (LS), Ridge Regression (RR), Weighted Ridge (WR), Robust M-estimation (M) and Robust Ridge Regression (RRR) methods, the last taking into account M-estimation procedures, were considered in this study. Results from a Monte-Carlo study revealed the superiority of the RRR estimator over the others, using the Mean Squared Errors (MSE) of parameter estimates and Absolute Bias (AB), among others, as assessment criteria over various distributions of the disturbance term and levels of multicollinearity. The study concluded that whenever linear regression modeling is intended and multicollinearity among the regressors and a non-spherical disturbance structure on the response variable are suspected in a data set, the RRR estimator should be adopted in order to ensure optimal efficiency.

Survival Analysis with Multivariate Adaptive Regression Splines using Cox-Snell Residual
(Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania, 2015) Dauda, K. A.; Yahya, W. B.; Banjoko, A. W.
Multivariate Adaptive Regression Splines (MARS) are a generalization of the stepwise linear regression method that is often employed to improve the efficiency of regression models.
It is a useful tool for identifying linear, nonlinear and interaction effects among a set of metrical and categorical covariates in regression models. In this study, the use of modified Cox-Snell residuals for survival analysis with MARS was proposed. The proposed method was compared with the Martingale residual in the survival MARS setting. These two residual types were used as responses in the Cox proportional hazard modeling in the MARS implementations. Results from simulation studies revealed that the proposed method fitted the data better than the Martingale residual. In addition, further results from the Monte-Carlo experiment showed that the two residual types performed better than the classical Cox Proportional Hazard (CPH) method. These methods were applied to a real-life dataset on Pneumocystis Carinii Pneumonia, and all the results obtained validated those from the simulation studies.

A Test Procedure for Ordered Hypothesis of Population Proportions Against a Control
(Turkish Clinical publications, Turkey, 2016) Yahya, W. B.; Olaniran, O. R.; Garba, M. K.; Oloyede, I.; Banjoko, A. W.; Dauda, K. A.; Olorede, K. O.
Objective: This paper aims to present a novel procedure for testing a set of population proportions against an ordered alternative with a control. Material and Methods: The distribution of the test statistic for the proposed test was determined theoretically and through Monte-Carlo experiments. The efficiency of the proposed test method was compared with the classical Chi-square test of homogeneity of population proportions using their empirical Type I error rates and powers at various sample sizes. Results: The new test statistic developed for testing a set of population proportions against an ordered alternative with a control was found to have a Chi-square distribution with non-integer degrees of freedom v that depend on the number of population groups k being compared.
A table of values of v for comparing up to 26 population groups was constructed, while an expression was developed to determine v for cases where k > 26. Further results showed that the new test method is capable of detecting the superiority of a treatment, for instance a new drug type, over some of the existing ones in situations where only qualitative data on users’ preferences among all the available treatments (drug types) are available. The new test method was found to be relatively more powerful and more consistent at estimating the nominal Type I error rates (α) than the classical Chi-square test of homogeneity of population proportions, especially at smaller sample sizes. Conclusion: The new test method proposed here could find applications in pharmacology, where a newly developed drug might be expected to be preferred by users over some of the existing ones. This kind of test problem can equally arise in medicine, engineering and the humanities in situations where only qualitative data on users’ preferences among a set of treatments or systems are available.

The Trade-off between the PLSR and PCR Methods for Modeling Data with Collinear Structure
(Nigerian Association of Mathematical Physics, 2017-01-20) Yahya, W. B.; Olorede, K. O.; Garba, M. K.; Banjoko, A. W.; Dauda, K. A.