SEMI BAYESIAN INFERENCE HIGH AND LOW DIMENSIONAL DATA WITH MULTICOLLINEARITY
Date
2018-03
Publisher
Journal of Science, Technology, Mathematics and Education (JOSTMED), Federal University of Technology, Minna
Abstract
It is generally known that correlation among features in high- and low-dimensional data leads to
parameter estimates that are artificially insignificant. This study investigates the asymptotic
properties of some semi-Bayesian estimators and compares them with a non-Bayesian estimator in
the presence of multicollinearity. Variational and empirical Bayes estimators were compared with
the ordinary least squares (OLS) estimator using bias, mean squared error (MSE) and predictive
mean squared error (PMSE). The number of iterations was 1000. In high-dimensional data,
empirical Bayes linear regression (EBLR) outperformed the other estimators, whereas OLS
performed poorly under the PMSE criterion. In low-dimensional data, variational Bayes linear
regression (VBLR) outperformed the other estimators, and OLS again performed poorly under the
PMSE criterion. Asymptotically, the three estimators were inconsistent but followed the same
pattern in low-dimensional data, although they were fairly consistent for sample sizes between
30 and 50 under the bias criterion. The study therefore concluded that the empirical Bayes
estimator should be adopted for high-dimensional data, while the variational Bayes estimator
should be adopted for low-dimensional data.
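The evaluation criteria named in the abstract (bias, MSE and PMSE) can be illustrated with a small Monte Carlo sketch. This is not the study's code: the design (equicorrelated Gaussian features with correlation `rho`), the true coefficients, the noise level and the function name `simulate` are all assumptions chosen for illustration, and the 1000 iterations are read here as simulation replications. A fixed-penalty ridge estimator (the posterior mean under a Gaussian prior with known precision `lam`) stands in for a shrinkage estimator; it is neither the EBLR nor the VBLR estimator of the study.

```python
import numpy as np


def simulate(n=30, p=5, rho=0.95, reps=1000, lam=1.0, seed=0):
    """Monte Carlo bias, MSE and PMSE for OLS and a ridge stand-in.

    All settings are illustrative assumptions, not the study's design.
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(p)  # assumed true coefficients
    # Equicorrelated covariance: 1 on the diagonal, rho elsewhere.
    cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)
    chol = np.linalg.cholesky(cov)

    estimates = {"ols": [], "ridge": []}
    pred_err = {"ols": [], "ridge": []}
    for _ in range(reps):
        # Training data and an independent test set from the same model.
        X = rng.standard_normal((n, p)) @ chol.T
        y = X @ beta + rng.standard_normal(n)
        X_new = rng.standard_normal((n, p)) @ chol.T
        y_new = X_new @ beta + rng.standard_normal(n)

        fits = {
            "ols": np.linalg.lstsq(X, y, rcond=None)[0],
            "ridge": np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y),
        }
        for name, b in fits.items():
            estimates[name].append(b)
            pred_err[name].append(np.mean((y_new - X_new @ b) ** 2))

    results = {}
    for name, draws in estimates.items():
        B = np.array(draws)  # shape (reps, p)
        results[name] = {
            # Mean absolute bias across the p coefficients.
            "bias": float(np.mean(np.abs(B.mean(axis=0) - beta))),
            # Average squared estimation error, summed over coefficients.
            "mse": float(np.mean(np.sum((B - beta) ** 2, axis=1))),
            # Average squared error on held-out responses.
            "pmse": float(np.mean(pred_err[name])),
        }
    return results


if __name__ == "__main__":
    for name, metrics in simulate().items():
        print(name, metrics)
```

Swapping in an empirical Bayes or variational Bayes fit would only require adding another entry to the `fits` dictionary; the three criteria are computed the same way for every estimator.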
Description
It is common knowledge that correlation among the feature variables in linear regression (LR)
affects the precision of the estimates, possibly leading to parameter estimates that are
artificially statistically insignificant (Barley, 1980). This results from the inherent instability
of inverting a near-singular matrix.
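The instability of inverting a nearly singular matrix can be seen in a short numerical sketch. The two-feature design, the sample size and the function name `ols_conditioning` below are assumptions made for illustration only. The code builds two features whose population correlation is `rho`, then reports the condition number of X'X and the diagonal of its inverse, which is proportional to the OLS coefficient variances.

```python
import numpy as np


def ols_conditioning(rho, n=50, seed=0):
    """Return cond(X'X) and diag((X'X)^-1) for two features with correlation rho."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    x1 = z1
    # x2 has unit variance and population correlation rho with x1.
    x2 = rho * z1 + np.sqrt(1.0 - rho**2) * z2
    X = np.column_stack([x1, x2])
    gram = X.T @ X
    cond = float(np.linalg.cond(gram))
    # OLS variances are sigma^2 times these diagonal entries.
    var_factor = np.diag(np.linalg.inv(gram))
    return cond, var_factor


if __name__ == "__main__":
    for rho in (0.0, 0.9, 0.999):
        cond, var_factor = ols_conditioning(rho)
        print(f"rho={rho}: cond(X'X)={cond:.1f}, diag(inv)={var_factor}")
```

As `rho` approaches 1 the condition number grows sharply and the variance factors inflate, which is why individual coefficients can appear statistically insignificant even when the features jointly predict the response well.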
Keywords
Variational Bayes, Empirical Bayes, High-dimensional data, Low-dimensional data