Browsing by Author "Gatta, N. F."
Now showing 1 - 10 of 10
Item: Cox Survival Modeling of Neonatal Mortality in Nigeria (Nigeria Journal of Pure and Applied Sciences, 2016)
Amusa, L. B.; Gatta, N. F.
Nigeria continues to have one of the highest rates of neonatal deaths in Africa. This study investigates risk factors associated with neonatal deaths in Nigeria using the 2013 Nigeria Demographic and Health Survey (NDHS). Neonatal deaths of all singleton live-born infants were extracted from the 2013 NDHS. The 2013 NDHS was a multi-stage cluster sample survey of 38,948 households. From these households, 1443 complete survival records were extracted, including 879 cases of neonatal mortality. Using the traditional Cox regression model, the significant factors affecting neonatal deaths were mother's level of education, residence type, size of the child, sex of the child, timing of breastfeeding initiation and number of antenatal visits. The study suggests that the Nigerian government needs to invest more in the healthcare system to ensure quality care for women and newborns and so reduce avoidable neonatal deaths in Nigeria.

Item: IMPROVED BAYESIAN FEATURE SELECTION AND CLASSIFICATION METHODS USING BOOTSTRAP PRIOR TECHNIQUES (Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania, 2016)
Olaniran, O. R.; Olaniran, S. F.; Yahya, W. B.; Banjoko, A. W.; Garba, M. K.; Amusa, L. B.; Gatta, N. F.
In this paper, the behavior of feature selection algorithms using the traditional t-test, the Bayesian t-test using MCMC and the Bayesian two-sample test using the proposed bootstrap prior technique was determined. In addition, we considered some frequentist classification methods such as k-Nearest Neighbor (k-NN), Logistic Discriminant (LD), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA) and Naïve Bayes when the conditional independence assumption is violated.
Two new Bayesian classifiers (B-LDA and B-QDA) were developed within the framework of LDA and QDA using the bootstrap prior technique. The model parameters were estimated using a Bayesian approach via the posterior distribution, which involves normalizing the prior for the attributes and the likelihood from the sample in a Monte Carlo experiment. The bootstrap prior technique was incorporated into the Normal-Inverse-Wishart natural conjugate prior for the parameters of the multivariate normal distribution, where the scale and location parameters were required. All the classifiers were implemented on the simulated data at a 90:10 training-test data ratio. The efficiencies of these classifiers were assessed using the misclassification error rate, sensitivity, specificity, positive predictive value, negative predictive value and area under the ROC curve. Results from various analyses established the supremacy of the proposed Bayes classifiers (B-LDA and B-QDA) over the existing frequentist and Naïve Bayes classification methods considered. All these methods, including the proposed ones, were implemented on a published binary response microarray data set to validate the results from the simulation study.

Item: Improved Bayesian Feature Selection and Classification Methods Using Bootstrap Prior Techniques (Faculty of Computer and Applied Computer Science, Tibiscus University of Timisoara, Romania, 2016)
Olaniran, O. R.; Olaniran, S. F.; Yahya, W. B.; Banjoko, A. W.; Garba, M. K.; Amusa, L. B.; Gatta, N. F.
In this paper, the behavior of feature selection algorithms using the traditional t-test, the Bayesian t-test using MCMC and the Bayesian two-sample test using the proposed bootstrap prior technique was determined. In addition, we considered some frequentist classification methods such as k-Nearest Neighbor (k-NN), Logistic Discriminant (LD), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA) and Naïve Bayes when the conditional independence assumption is violated.
Two new Bayesian classifiers (B-LDA and B-QDA) were developed within the framework of LDA and QDA using the bootstrap prior technique. The model parameters were estimated using a Bayesian approach via the posterior distribution, which involves normalizing the prior for the attributes and the likelihood from the sample in a Monte Carlo experiment. The bootstrap prior technique was incorporated into the Normal-Inverse-Wishart natural conjugate prior for the parameters of the multivariate normal distribution, where the scale and location parameters were required. All the classifiers were implemented on the simulated data at a 90:10 training-test data ratio. The efficiencies of these classifiers were assessed using the misclassification error rate, sensitivity, specificity, positive predictive value, negative predictive value and area under the ROC curve. Results from various analyses established the supremacy of the proposed Bayes classifiers (B-LDA and B-QDA) over the existing frequentist and Naïve Bayes classification methods considered. All these methods, including the proposed ones, were implemented on a published binary response microarray data set to validate the results from the simulation study.

Item: Investigating the Effects of Multicollinearity on the Model Parameters of Ordinary Least Squares Estimator (Sretech Journal Publications, 2019)
Gatta, N. F.; Banjoko, A. W.
This study investigated the effects of multicollinearity on the model parameters of the ordinary least squares regression model. The aim was to examine the impact of multicollinearity on the efficiency of classical ordinary least squares (OLS). Data were simulated from a multivariate normal distribution with mean zero and variance-covariance matrix at sample sizes of 25, 50, 100, 200, 500 and 1000.
To assess the asymptotic efficiency and consistency of the regression models in the presence of multicollinearity, the evaluation criteria used were the variance, absolute bias, Mean Square Error (MSE) and Mean Square Error of Prediction (MSEP). Results from the analysis revealed that OLS is not efficient, given the large MSE, MSEP and absolute bias.

Item: Investigating the Effects of Multicollinearity on the Model Parameters of Ordinary Least Squares Estimator (Sretech Journal Publications, 2019)
Gatta, N. F.; Banjoko, A. W.
This study investigated the effects of multicollinearity on the model parameters of the ordinary least squares regression model. The aim was to examine the impact of multicollinearity on the efficiency of classical ordinary least squares (OLS). Data were simulated from a multivariate normal distribution with mean zero and variance-covariance matrix at sample sizes of 25, 50, 100, 200, 500 and 1000. To assess the asymptotic efficiency and consistency of the regression models in the presence of multicollinearity, the evaluation criteria used were the variance, absolute bias, Mean Square Error (MSE) and Mean Square Error of Prediction (MSEP). Results from the analysis revealed that OLS is not efficient, given the large MSE, MSEP and absolute bias.

Item: Multiclass Feature Selection and Classification with Support Vector Machine in Genomic Study (Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017)
Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Amusa, L. B.; Gatta, N. F.; Dauda, K. A.; Olorede, K. O.
This study proposes an efficient Support Vector Machine (SVM) algorithm for feature selection and classification of multiclass response groups in high-dimensional (microarray) data.
The feature selection stage of the algorithm employed the F-statistic of the ANOVA-like testing scheme at a chosen family-wise error rate (FWER) to control the detection of false positive features. In a 10-fold cross-validation, the hyper-parameters of the SVM were tuned to determine the appropriate kernel using a one-versus-all approach. The entire simulated dataset was randomly partitioned into 95% training and 5% test sets, with the SVM classifier built on the training sets while its prediction accuracy on the response class was assessed on the test sets over 1000 Monte Carlo cross-validation (MCCV) runs. The classification results of the proposed classifier were assessed using the Misclassification Error Rates (MERs) and other performance indices. Results from the Monte Carlo study showed that the proposed SVM classifier was quite efficient, yielding high prediction accuracy of the response groups with fewer differentially expressed features than when all the features were employed for classification. The performance of this new method on some published cancer data sets will be examined vis-à-vis other state-of-the-art machine learning methods in future work.

Item: Multiclass Feature Selection and Classification with Support Vector Machine in Genomic Study (Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017)
Banjoko, A. W.; Yahya, W. B.; Garba, M. K.; Olaniran, O. R.; Amusa, L. B.; Gatta, N. F.; Dauda, K. A.; Olorede, K. O.
This study proposes an efficient Support Vector Machine (SVM) algorithm for feature selection and classification of multiclass response groups in high-dimensional (microarray) data. The feature selection stage of the algorithm employed the F-statistic of the ANOVA-like testing scheme at a chosen family-wise error rate (FWER) to control the detection of false positive features.
In a 10-fold cross-validation, the hyper-parameters of the SVM were tuned to determine the appropriate kernel using a one-versus-all approach. The entire simulated dataset was randomly partitioned into 95% training and 5% test sets, with the SVM classifier built on the training sets while its prediction accuracy on the response class was assessed on the test sets over 1000 Monte Carlo cross-validation (MCCV) runs. The classification results of the proposed classifier were assessed using the Misclassification Error Rates (MERs) and other performance indices. Results from the Monte Carlo study showed that the proposed SVM classifier was quite efficient, yielding high prediction accuracy of the response groups with fewer differentially expressed features than when all the features were employed for classification. The performance of this new method on some published cancer data sets will be examined vis-à-vis other state-of-the-art machine learning methods in future work.

Item: On the Approximation of Pareto Distribution to Exponential Distribution Using the Gini Coefficient of Inequality (Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017)
Yahya, W. B.; Garba, M. K.; Amidu, L.; Olorede, K. O.; Gatta, N. F.; Amusa, L. B.
Pareto proposed that income and wealth distribution obeys a universal power law valid for all times and countries, but subsequent studies have often disputed this position. Some even argued that there is no Pareto Law and that it should be entirely discarded in studies on the distribution of wealth or resources. Many other probability distributions have been proposed, such as the log-normal, exponential and gamma distributions and two other forms by Pareto himself.
Using data on imported goods from the National Bureau of Statistics as a case of distribution of wealth in Nigeria, we demonstrated that the distribution of money spent on importation in Nigeria also follows an exponential distribution, using the Gini coefficient, which is a measure of inequality (degree of concentration) of a variable in the distribution of resources. Simulation studies were carried out at different numbers of items (or households) and varying values of the shape parameter, and we compared how closely the Gini coefficients of the exponential distribution approximate those obtained from the Pareto data, to assess the exponential distribution as a credible alternative to the Pareto distribution.

Item: Performance Evaluation of Some Estimators of Linear Models with Collinearity and Non–Gaussian Error (Edited Conference Proceedings of the 1st International Conference of the Nigeria Statistical Society (NSS), 2017)
Yahya, W. B.; Garba, M. K.; Ajayi, A. G.; Dauda, K. A.; Olaniran, O. R.; Gatta, N. F.
Among the typical challenges in multiple linear regression models are multicollinearity and non-normal disturbances, which have undesirable consequences for the ordinary least squares (OLS) estimator, the popular and naïve technique for estimating linear models. It is therefore critical to combine estimation strategies to cope when these challenges are present. In this study, the strengths of some methods of estimating the classical linear regression model in the presence of multicollinearity and non-normal error structures were investigated. The conventional Least Squares (LS), Ridge Regression (RR), Weighted Ridge (WR), Robust M-estimation (M) and Robust Ridge Regression (RRR) methods, taking into account M-estimation procedures, were considered in this study.
Results from the Monte Carlo study revealed the superiority of the RRR estimator over the others, using the Mean Squared Error (MSE) of parameter estimates and Absolute Bias (AB), among other assessment criteria, across various distributions of the disturbance term and levels of multicollinearity. The study concluded that whenever linear regression modeling is intended and multicollinearity among the regressors and a non-spherical disturbance structure on the response variable are suspected in a data set, the RRR estimator should be adopted to ensure optimal efficiency.

Item: Robust Regression Methods for Solving Non-Spherical Problem in Linear Regression (Sretech Journal Publications, 2019)
Gatta, N. F.; Yahya, W. B.; Garba, M. K.
This study investigated the effects of non-spherical disturbance on the model parameters of some classical regression models. The aim was to examine the impact of multicollinearity on the efficiency of classical ordinary least squares (OLS) relative to the ridge regression (RR) and principal component regression (PCR) models. Data were simulated from a multivariate normal distribution with mean zero and variance-covariance matrix at sample sizes of 25, 50, 100, 200, 500 and 1000. To assess the asymptotic efficiency and consistency of these regression models in the presence of multicollinearity, the evaluation criteria used were the variance, absolute bias, Mean Square Error (MSE) and Mean Square Error of Prediction (MSEP). Results from this work showed that the RR model had smaller variance, absolute bias and MSE than OLS. Also, the ridge estimator had the least MSEP when compared with both the OLS and PCR models. Hence, it can be concluded that the ridge estimator performed better than OLS and PCR when the explanatory variables are highly correlated.
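
The kind of comparison described in the last abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' code or design: the equicorrelated covariance matrix, the correlation `rho`, the true coefficients (all ones), the unit-variance normal errors and the ridge penalty `lam` are all assumptions chosen for illustration, and only the coefficient MSE of OLS and ridge is computed (PCR and MSEP are omitted).

```python
import numpy as np

def simulate_mse(n=50, p=4, rho=0.95, lam=1.0, reps=500, seed=0):
    """Monte Carlo MSE of OLS and ridge coefficient estimates
    when the predictors are highly correlated (illustrative only)."""
    rng = np.random.default_rng(seed)
    beta = np.ones(p)  # assumed true coefficients, not from the source
    # Equicorrelated covariance: 1 on the diagonal, rho elsewhere (assumption)
    cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)
    sq_err = {"ols": 0.0, "ridge": 0.0}
    for _ in range(reps):
        # Predictors drawn from a zero-mean multivariate normal
        X = rng.multivariate_normal(np.zeros(p), cov, size=n)
        y = X @ beta + rng.standard_normal(n)
        xtx = X.T @ X
        xty = X.T @ y
        # OLS solves (X'X) b = X'y; ridge adds lam * I to X'X
        b_ols = np.linalg.solve(xtx, xty)
        b_ridge = np.linalg.solve(xtx + lam * np.eye(p), xty)
        sq_err["ols"] += np.sum((b_ols - beta) ** 2)
        sq_err["ridge"] += np.sum((b_ridge - beta) ** 2)
    # Average squared estimation error over the replications
    return {name: total / reps for name, total in sq_err.items()}

result = simulate_mse()
print(result)
```

Under these assumed settings the ridge estimate typically shows the smaller coefficient MSE, in line with the abstract's conclusion; other choices of `beta`, `rho` or `lam` can change the size of the gap.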