Browsing by Author "Banjoko, A. W."
Now showing 1 - 19 of 19
Item Competing Risk Modeling Using Cumulative Incidence Function: Application to Recurrent Bladder Cancer data (FUOYE Journal of Engineering and Technology, Federal University of Science and Technology, Oye-Ekiti, Nigeria, 2018) Dauda, K. A.; Yahya, W. B.; Banjoko, A. W.; Olorede, K. O.
In this study, the effects of some clinical variables on the survival times of patients with bladder cancer were examined. The effects of these variables on the sub-distribution of the failure types were determined using the proportional sub-distribution hazards regression model described in Fine and Gray (1999). A published dataset on 294 bladder cancer patients with four clinical outcomes was analyzed using the Cumulative Incidence Function approach. The four outcomes included 184 (64%) patients that experienced recurrence of bladder cancer after receiving chemotherapy treatments. Two patients died of bladder cancer while 27 patients died of other causes and the remaining 76 patients did not experience any of these three outcomes, and as a result, were considered censored. Among the covariates considered, only the patients’ initial number of tumours and initial size of tumour were incorporated into our analysis due to the high proportion of missing observations in the others. Results from this work showed that patients with tumour recurrence have a higher risk of dying than those who experienced other events. Further results showed that the number of tumours was positively associated with the recurrence of cancer of the bladder. However, the size of the tumour did not demonstrate a significant effect on the patients’ survival time. It can therefore be concluded that patients with tumour recurrence have a lower probability of survival from bladder cancer than patients that experienced other events.
Above all, the number, but not the size, of tumours could adversely affect the survival time of bladder cancer patients, especially those with tumour recurrence after bladder cancer treatment.

Item Efficient Support Vector Machine Classification of Diffuse Large B-Cell Lymphoma and Follicular Lymphoma mRNA Tissue Samples (Annals Computer Science Series, 2015-04-25) Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Dauda, K. A.; Olorede, K. O.
This paper proposes a weighted Support Vector Machine (w-SVM) method for efficient class prediction in binary response data sets. The proposed method was obtained by introducing weights, which utilize the point biserial correlation between each of the predictors and the dichotomized response variable, into the standard SVM algorithm to maximize the classification accuracy. The optimal values of the proposed w-SVM cost and each of the kernel parameters were determined by grid search in a 10-fold cross-validation resampling method. The Monte-Carlo Cross-Validation method was employed to examine the predictive power of the proposed method by partitioning the data into train and test samples using different sample splitting ratios. Application of the proposed method on the simulated data sets yielded high prediction accuracy on the test sample. Results from other performance indices further gave credence to the efficiency of the proposed method. The performance of the proposed method was compared with three state-of-the-art machine learning methods, including the standard SVM, and the results showed the superiority of this method over the others.
Finally, the results generally show that the modified algorithm with the Radial Basis Function (RBF) kernel performed excellently and achieved better predictive performance than any of the existing classifiers considered.

Item Efficient Support Vector Machine Classification of Diffuse Large B-Cell Lymphoma and Follicular Lymphoma mRNA Tissue Samples (Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania, 2015) Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Dauda, K. A.; Olorede, K. O.
In this study, an efficient Support Vector Machine (SVM) algorithm that incorporates a feature selection procedure for efficient identification and selection of gene biomarkers that are predictive of Diffuse Large B-Cell Lymphoma (DLBCL) and Follicular Lymphoma (FL) cancer tumor samples is presented. The data employed were published real-life microarray cancer data that contained 7,129 gene expression profiles measured on 77 biological samples that comprised 58 DLBCL and 19 FL tissue samples. The dimension reduction approach of the Welch statistic was employed at the feature selection phase of the SVM algorithm. The cost and kernel parameters of the SVM model were tuned over a 10-fold cross-validation to improve the efficiency of the SVM classifier. The entire sample was randomly partitioned into 95% training and 5% test samples. The SVM classifier was trained using the Monte Carlo Cross-Validation approach with 1000 replications. The performance of this classifier was assessed on the test samples using the misclassification error rate (MER) and other performance measures. The results showed that the SVM classifier is quite efficient, yielding very high prediction accuracy of the tumor samples with fewer differentially expressed genes. The selected gene biomarkers in this work can be subjected to further clinical screening for proper determination of their biological relationship with the DLBCL and FL tumour subgroups.
However, more studies with large samples might be needed in future to validate the results from this work.

Item IMPROVED BAYESIAN FEATURE SELECTION AND CLASSIFICATION METHODS USING BOOTSTRAP PRIOR TECHNIQUES (Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania, 2016) Olaniran, O. R.; Olaniran, S. F.; Yahya, W. B.; Banjoko, A. W.; Garba, M. K.; Amusa, L. B.; Gatta, N. F.
In this paper, the behavior of feature selection algorithms using the traditional t-test, the Bayesian t-test using MCMC and the Bayesian two-sample test using the proposed bootstrap prior technique was determined. In addition, we considered some frequentist classification methods like k-Nearest Neighbor (k-NN), Logistic Discriminant (LD), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA) and Naïve Bayes when the conditional independence assumption is violated. Two new Bayesian classifiers (B-LDA and B-QDA) were developed within the framework of LDA and QDA using the bootstrap prior technique. The model parameters were estimated using a Bayesian approach via the posterior distribution that involves normalizing the prior for the attributes and the likelihood from the sample in a Monte-Carlo experiment. The bootstrap prior technique was incorporated into the Normal-Inverse-Wishart natural conjugate prior for the parameters of the multivariate normal distribution where the scale and location parameters were required. All the classifiers were implemented on the simulated data at a 90:10 training-test data ratio. The efficiencies of these classifiers were assessed using the misclassification error rate, sensitivity, specificity, positive predictive value, negative predictive value and area under the ROC curve. Results from various analyses established the supremacy of the proposed Bayes classifiers (B-LDA and B-QDA) over the existing frequentist and Naïve Bayes classification methods considered.
All these methods, including the proposed ones, were implemented on a published binary response microarray data set to validate the results from the simulation study.

Item Improved Bayesian Feature Selection and Classification Methods Using Bootstrap Prior Techniques (Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania, 2016) Olaniran, O. R.; Olaniran, S. F.; Yahya, W. B.; Banjoko, A. W.; Garba, M. K.; Amusa, L. B.; Gatta, N. F.
In this paper, the behavior of feature selection algorithms using the traditional t-test, the Bayesian t-test using MCMC and the Bayesian two-sample test using the proposed bootstrap prior technique was determined. In addition, we considered some frequentist classification methods like k-Nearest Neighbor (k-NN), Logistic Discriminant (LD), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA) and Naïve Bayes when the conditional independence assumption is violated. Two new Bayesian classifiers (B-LDA and B-QDA) were developed within the framework of LDA and QDA using the bootstrap prior technique. The model parameters were estimated using a Bayesian approach via the posterior distribution that involves normalizing the prior for the attributes and the likelihood from the sample in a Monte-Carlo experiment. The bootstrap prior technique was incorporated into the Normal-Inverse-Wishart natural conjugate prior for the parameters of the multivariate normal distribution where the scale and location parameters were required. All the classifiers were implemented on the simulated data at a 90:10 training-test data ratio. The efficiencies of these classifiers were assessed using the misclassification error rate, sensitivity, specificity, positive predictive value, negative predictive value and area under the ROC curve. Results from various analyses established the supremacy of the proposed Bayes classifiers (B-LDA and B-QDA) over the existing frequentist and Naïve Bayes classification methods considered.
All these methods, including the proposed ones, were implemented on a published binary response microarray data set to validate the results from the simulation study.

Item Investigating the Effects of Multicollinearity on the Model Parameters of Ordinary Least Squares Estimator (Sretech Journal Publications, 2019) Gatta, N. F.; Banjoko, A. W.
This study investigated the effects of multicollinearity on the model parameters of the ordinary least squares regression model. The aim was to examine the impacts of multicollinearity on the efficiency of classical ordinary least squares (OLS). Data were simulated from a multivariate normal distribution with mean zero and variance-covariance matrix at various sample sizes 25, 50, 100, 200, 500 and 1000. To assess the asymptotic efficiency and consistency of the regression models in the presence of multicollinearity, the evaluation criteria used were the Variance, Absolute bias, Mean Square Error (MSE) and Mean Square Error of Prediction (MSEP). Results from the analysis revealed that the OLS is not efficient given the large MSE, MSEP, and Absolute bias.

Item Investigating the Effects of Multicollinearity on the Model Parameters of Ordinary Least Squares Estimator (Sretech Journal Publications, 2019) Gatta, N. F.; Banjoko, A. W.
This study investigated the effects of multicollinearity on the model parameters of the ordinary least squares regression model. The aim was to examine the impacts of multicollinearity on the efficiency of classical ordinary least squares (OLS). Data were simulated from a multivariate normal distribution with mean zero and variance-covariance matrix at various sample sizes 25, 50, 100, 200, 500 and 1000. To assess the asymptotic efficiency and consistency of the regression models in the presence of multicollinearity, the evaluation criteria used were the Variance, Absolute bias, Mean Square Error (MSE) and Mean Square Error of Prediction (MSEP).
Results from the analysis revealed that the OLS is not efficient given the large MSE, MSEP, and Absolute bias.

Item Multiclass Feature Selection and Classification with Support Vector Machine in Genomic Study (Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017) Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Amusa, L. B.; Gatta, N. F.; Dauda, K. A.; Olorede, K. O.
This study proposes an efficient Support Vector Machine (SVM) algorithm for feature selection and classification of multiclass response groups in high-dimensional (microarray) data. The feature selection stage of the algorithm employed the F-statistic of the ANOVA-like testing scheme at some chosen family-wise error rate (FWER) to control for the detection of some false positive features. In a 10-fold cross-validation, the hyper-parameters of the SVM were tuned to determine the appropriate kernel using the one-versus-all approach. The entire simulated dataset was randomly partitioned into 95% training and 5% test sets, with the SVM classifier built on the training sets while its prediction accuracy on the response class was assessed on the test sets over 1000 Monte-Carlo cross-validation (MCCV) runs. The classification results of the proposed classifier were assessed using the Misclassification Error Rates (MERs) and other performance indices. Results from the Monte-Carlo study showed that the proposed SVM classifier was quite efficient, yielding high prediction accuracy of the response groups with fewer differentially expressed features than when all the features were employed for classification.
The performance of this new method on some published cancer data sets shall be examined vis-à-vis other state-of-the-art machine learning methods in future works.

Item Multiclass Feature Selection and Classification with Support Vector Machine in Genomic Study (Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017) Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Amusa, L. B.; Gatta, N. F.; Dauda, K. A.; Olorede, K. O.
This study proposes an efficient Support Vector Machine (SVM) algorithm for feature selection and classification of multiclass response groups in high-dimensional (microarray) data. The feature selection stage of the algorithm employed the F-statistic of the ANOVA-like testing scheme at some chosen family-wise error rate (FWER) to control for the detection of some false positive features. In a 10-fold cross-validation, the hyper-parameters of the SVM were tuned to determine the appropriate kernel using the one-versus-all approach. The entire simulated dataset was randomly partitioned into 95% training and 5% test sets, with the SVM classifier built on the training sets while its prediction accuracy on the response class was assessed on the test sets over 1000 Monte-Carlo cross-validation (MCCV) runs. The classification results of the proposed classifier were assessed using the Misclassification Error Rates (MERs) and other performance indices. Results from the Monte-Carlo study showed that the proposed SVM classifier was quite efficient, yielding high prediction accuracy of the response groups with fewer differentially expressed features than when all the features were employed for classification.
The performance of this new method on some published cancer data sets shall be examined vis-à-vis other state-of-the-art machine learning methods in future works.

Item On the Strength of Agreement between Initial and Final Academic performances in a Nigerian University System (ABACUS, Published by Mathematical Association of Nigeria, 2018) Banjoko, A. W.; Yahya, W. B.; Abiodun, H. S.; Afolayan, R. B.; Garba, M. K.; Olorede, K. O.; Dauda, K. A.; Adeleke, M. O.
This paper examines the strength of agreement between academic performances of students after their first and final years in the University. Academic performances of a total of 886 students that were admitted into various academic programs in the Faculty of Science, University of Ilorin, during the 2008/2009 academic session were followed up to their year of graduation in 2012. Information on the grade point average (GPA) of students at the end of their first year in 2008 and their final cumulative grade point average (CGPA) at the end of their studies in 2012, among others, was collected. Results from this study generally showed a fair agreement between students’ initial and final academic performances in the Nigerian university system (p < 0.001). It was also found that about 50% of students maintained the classes of degrees they had in their first year till graduation, about 40% of them improved on their performances while the performances of about 7% of them dropped from what they had in their first year. Further results showed that students’ performance is gender sensitive. Specifically, about 45% and 60% of female and male students maintained the classes of degrees they had during their first year in the University, about 50% and 30% of them improved on theirs while about 5% and 10% of them dropped from their initial academic performances at the end of their studies respectively.
Finally, students in the Biological Sciences improved on their initial academic performances more than their counterparts in the Physical Sciences. Also, female students improved on their initial academic performances more than their male counterparts. This work will serve as a useful counselling guide to prospective admission seekers into the Universities and all the stakeholders at enhancing students’ academic performances in the University system.

Item On the Strength of Agreement Between Students’ Initial and Final Academic Performances in Nigeria University System (ABACUS, Mathematical Association of Nigeria, Nigeria, 2018) Banjoko, A. W.; Yahya, W. B.; Abiodun, H. S.; Adeleke, M. O.; Afolayan, R. B.; Garba, M. K.; Olorede, K. O.; Dauda, K. A.
This paper examines the strength of agreement between academic performances of students after their first and final years in the University. Academic performances of a total of 886 students that were admitted into various academic programs in the Faculty of Science, University of Ilorin, during the 2008/2009 academic session were followed up to their year of graduation in 2012. Information on the grade point average (GPA) of students at the end of their first year in 2008 and their final cumulative grade point average (CGPA) at the end of their studies in 2012, among others, was collected. Results from this study generally showed a fair agreement between students’ initial and final academic performances in the Nigerian university system (p < 0.001). It was also found that about 50% of students maintained the classes of degrees they had in their first year till graduation, about 40% of them improved on their performances while the performances of about 7% of them dropped from what they had during their first year. Further results showed that students’ performance is gender sensitive.
Specifically, about 45% and 60% of female and male students maintained the classes of degrees they had during their first year in the University, about 50% and 30% of them improved on theirs while about 5% and 10% of them dropped from their initial academic performances at the end of their studies respectively. Finally, students in the Biological Sciences improved on their initial academic performances more than their counterparts in the Physical Sciences. Also, female students improved on their initial academic performances more than their male counterparts. This work will serve as a useful counselling guide to prospective admission seekers into the Universities and all the stakeholders at enhancing students’ academic performances in the University system.

Item Partial Least Squares-Based Classification and Selection of Predictive Variables of Crimes against Properties in Nigeria (Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017) Olorede, K. O.; Yahya, W. B.; Garuba, A. O.; Banjoko, A. W.; Dauda, K. A.
In this study, the state-of-the-art Partial Least Squares (PLS) based models (PLS-Discriminant Analysis (PLS-DA), Sparse PLS-DA (SPLS-DA) and Sparse Generalized PLS (SGPLS)) were employed to model and classify the rate of crimes (low or high) committed against properties across the 36 states in Nigeria and the Federal Capital Territory (FCT). The core variables that are predictive of this crime type in Nigeria were identified using the LASSO penalty method via the PLS. Data on occurrences of cases of offences against property obtained from the database of the Nigerian Police Force were utilized in this study. The missing values due to non-occurrence or non-reportage of crime cases were imputed using the technique of multivariate imputation by chained equations. The complete data set was partitioned into training and test sets using an 80:20 holdout scheme.
The 80% training set was used to build the PLS-based models that were in turn used to predict the overall crime rates of Nigerian cities in the 20% held-out test data over 200 Monte-Carlo cross-validation runs. All the PLS-based models yielded good classification of unseen test samples into either of two qualitative classes of high and low crime rates with an average Correct Classification Rate (CCR) of 94%. Other performance metrics including sensitivity, specificity, positive and negative predictive values, balanced accuracy and diagnostic odds ratio were estimated to further examine their classification efficiencies. The SGPLS identified fewer (just 3 out of 12) core relevant crime variables that are predictive of the overall crime rates in Nigerian states, with a higher CCR, than the SPLS, which selected 9 such variables to achieve about the same feat.

Item Sequential Optimization Based Feature Selection Algorithm for Efficient Cancer Classification and Prediction (Proceedings of the 14th iSTEAMS International Multidisciplinary Conference, Al-Hikmah University, Ilorin, Nigeria, 2018) Banjoko, A. W.; Yahya, W. B.
This study proposes an efficient method for optimal selection of feature subsets to enhance the classification performance of the Support Vector Machine (SVM) in binary and multiclass response high-dimensional genomic microarray data using a Multi-Objective Optimization (MOO) approach. In a Monte-Carlo experiment, a pre-selection of the features was performed with the filter method based on the Sidak alpha value to reduce the number of false positive features in the data. The optimal values of the tuning parameters for both the SVM cost and the Radial Basis Function (RBF) kernel were determined by grid search in a 10-fold cross-validation. The SVM with RBF kernel was then fitted sequentially to select the set of near-optimal genes that are correlated with the response class.
The proposed algorithm was compared with the following four machine learning methods: Naïve Bayes (NB), Random Forest (RF), Random Forest with variable selection (RFVS) and LASSO. The Misclassification Error Rate (MER) of the proposed method on simulated data was 1.1% with a sensitivity of 97.8% using four (near) optimal selected genes. In contrast, the MERs of the NB, RF, RFVS and LASSO classifiers with 10, 10, 9 and 37 genes were 4.28%, 5.03%, 4.98% and 0.00% respectively using the data. Application of the proposed method on published Leukaemia data yielded an MER of 0.03% with a sensitivity of 99.95% based on three (3) optimally selected genes. On the other hand, the MERs of the NB, RF, RFVS and LASSO classifiers for the Leukaemia data were 1.0%, 3.0%, 5.67% and 0.00% based on 93, 93, 2 and 31 genes respectively. These same feats of performance were achieved by all the methods considered on a multiclass response DNA data set. The results generally showed that the proposed algorithm is more parsimonious and achieved better predictive performance than some of the existing methods considered. The optimally selected gene subsets in the data employed here can be further investigated by molecular biologists to establish the pathology of these genes with respect to their respective tumour classes.

Item Structural Relationships of Exchange Rates of Naira to Some Foreign Currencies (Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017) Garba, M. K.; Yahya, W. B.; Babaita, H. T.; Banjoko, A. W.; Amobi, A. Q.
This study investigates the existence of causality among exchange rates of the Naira to three of the major foreign currencies (Euro, Pound Sterling and US Dollar). The work is aimed at determining the patterns of causalities that exist among these three foreign currencies to the Nigerian Naira using multivariate time series modelling techniques.
The data employed for this study were daily exchange rates of the Naira to the Euro, Pound Sterling and US Dollar over a period of thirteen years from 1st January 2002 to 31st December 2014. The rates were national datasets extracted from the published statistical bulletin of the Central Bank of Nigeria. The Vector Autoregressive (VAR) model, which is useful for describing the dynamic behavior of economic and financial time series, was fitted to the data. The potential causal relationships among the three exchange rates were examined using Granger causality tests. Results revealed that the future exchange rates of Naira to Euro can be predicted by the past values of Naira to Euro and Naira to US Dollar. Finally, the exchange rate of Naira to Pound Sterling was Granger-caused by the Naira to Euro and Naira to US Dollar exchange rates, and the exchange rate of Naira to US Dollar was Granger-caused by the Naira to Euro exchange rate. Results from this work would assist the government, policy makers and other interested stakeholders to be familiar with the inherent relationship among the notable currencies to the Naira for efficient business decisions.

Item Survival Analysis with Multivariate Adaptive Regression Splines using Cox-Snell Residual (Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania, 2015) Dauda, K. A.; Yahya, W. B.; Banjoko, A. W.
Multivariate Adaptive Regression Splines (MARS) are a generalization of the stepwise linear regression method that is often employed to improve the efficiency of regression models. It is a useful tool to identify linear/nonlinear and interaction effects between a set of metrical and categorical covariates in regression models. In this study, the use of modified Cox-Snell residuals in Survival Analysis with MARS was proposed. The proposed method was compared with the Martingale residual in the Survival MARS setting.
These two residual types were used as responses in the Cox proportional hazard modeling in the MARS implementations. Results from simulation studies revealed that the proposed method fitted the data better than the Martingale residual. However, further results from a Monte-Carlo experiment showed that the two residual types performed better than the classical Cox Proportional Hazard (CPH) method. These methods were applied to a real-life dataset on Pneumocystis Carinii Pneumonia and all the results obtained validated those obtained from the simulation studies.

Item Taguchi’s Loss Function Approach to Economic Design of Rectifying Single Sampling Plan by Variable with Asymmetric Quality Cost (University of Ilorin, Nigeria, 2013) Banjoko, A. W.; Oyeyemi, G. M.
The quadratic Taguchi (quality) loss function has been given wide attention in the application of statistical (economic) process control, which in most cases is aimed at obtaining the optimal setting for a process. This article is an extension of Chung-Ho Chen (2005) and focuses on the use of the Taguchi loss function with truncated normal probability density function in designing optimal inspection policy and economic specification limits for rectifying single sampling plan by variable with unequal costs (Cs ≠ Cw = C).

Item Taguchi’s Loss Function Approach to Economic Design of Rectifying Single Sampling Plan by Variable with Asymmetric Quality Cost (Centre point Journal (Science Edition), University of Ilorin, 2013) Banjoko, A. W.; Oyeyemi, G. M.
The quadratic Taguchi (quality) loss function has been given wide attention in the application of statistical (economic) process control, which in most cases is aimed at obtaining the optimal setting for a process.
This article is an extension of Chung-Ho Chen (2005) and focuses on the use of the Taguchi loss function with truncated normal probability density function in designing optimal inspection policy and economic specification limits for rectifying single sampling plan by variable with unequal quality costs.

Item A Test Procedure for Ordered Hypothesis of Population Proportions Against a Control (Turkish Clinical publications, Turkey, 2016) Yahya, W. B.; Olaniran, O. R.; Garba, M. K.; Oloyede, I.; Banjoko, A. W.; Dauda, K. A.; Olorede, K. O.
Objective: This paper aims to present a novel procedure for testing a set of population proportions against an ordered alternative with a control. Material and Methods: The distribution of the test statistic for the proposed test was determined theoretically and through Monte-Carlo experiments. The efficiency of the proposed test method was compared with the classical Chi-square test of homogeneity of population proportions using their empirical Type I error rates and powers at various sample sizes. Results: The new test statistic that was developed for testing a set of population proportions against an ordered alternative with a control was found to have a Chi-square distribution with non-integer degrees of freedom v that depend on the number of population groups k being compared. A table of values of v for comparing up to 26 population groups was constructed, while an expression was developed to determine v for cases where k > 26. Further results showed that the new test method is capable of detecting the superiority of a treatment, for instance a new drug type, over some of the existing ones in situations where only the qualitative data on users’ preferences of all the available treatments (drug types) are available.
The new test method was found to be relatively more powerful and consistent at estimating the nominal Type I error rates (α), especially at smaller sample sizes, than the classical Chi-square test of homogeneity of population proportions. Conclusion: The new test method proposed here could find applications in pharmacology where a newly developed drug might be expected to be more preferred by users than some of the existing ones. This kind of test problem can equally exist in medicine, engineering and humanities in situations where only the qualitative data on users’ preferences of a set of treatments or systems are available.

Item The Trade-off between the PLSR and PCR Methods for Modeling Data with Collinear Structure (Nigerian Association of Mathematical Physics, 2017-01-20) Yahya, W. B.; Olorede, K. O.; Garba, M. K.; Banjoko, A. W.; Dauda, K. A.