SEMI BAYESIAN INFERENCE HIGH AND LOW DIMENSIONAL DATA WITH MULTICOLLINEARITY

Oloyede, Isiaka (2018-03)

It is common knowledge that correlation among the feature variables in linear regression (LR) affects the precision of the estimates, possibly leading to parameter estimates that are artificially statistically insignificant (Barley, 1980). This results from the inherent instability of inverting a near-singular matrix.
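The instability mentioned above can be illustrated numerically. The following is a minimal sketch, not taken from the article: it assumes two standardized features with a chosen population correlation `rho` (the function names, sample size and seed are all illustrative), and reports the condition number of X'X together with the OLS coefficient variances, sigma^2 (X'X)^{-1}.

```python
import numpy as np


def make_design(rho, n=200, seed=0):
    """Two features with population correlation rho (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    x2 = rho * z1 + np.sqrt(1.0 - rho**2) * z2  # corr(z1, x2) = rho
    return np.column_stack([z1, x2])


def gram_condition_number(rho, n=200, seed=0):
    """Condition number of X'X; large values signal a near-singular matrix."""
    X = make_design(rho, n, seed)
    return float(np.linalg.cond(X.T @ X))


def ols_coef_variances(rho, n=200, sigma2=1.0, seed=0):
    """Diagonal of sigma^2 (X'X)^{-1}: sampling variances of the OLS coefficients."""
    X = make_design(rho, n, seed)
    return sigma2 * np.diag(np.linalg.inv(X.T @ X))


if __name__ == "__main__":
    for rho in (0.0, 0.9, 0.999):
        print(rho, gram_condition_number(rho), ols_coef_variances(rho))
```

As `rho` approaches 1, the condition number and the coefficient variances both grow, which is why individual coefficients can appear statistically insignificant even when the features jointly explain the response.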

Article

It is generally known that correlation among features in high- and low-dimensional data leads to parameter estimates that are artificially insignificant. This study investigates the asymptotic properties of some semi-Bayesian estimators and compares them with a non-Bayesian estimator in the presence of multicollinearity. Variational and empirical Bayes estimators were compared with the ordinary least squares (OLS) estimator using bias, mean squared error (MSE) and predictive mean squared error (PMSE). The number of iterations was 1000. In high-dimensional data, empirical Bayes linear regression (EBLR) outperformed the other estimators, whereas OLS performed poorly under the PMSE criterion. In low-dimensional data, variational Bayes linear regression (VBLR) outperformed the other estimators, and OLS again performed poorly under the PMSE criterion. Asymptotically, the three estimators were inconsistent but followed the same pattern in low-dimensional data; under the bias criterion, however, they were fairly consistent for sample sizes between 30 and 50. The study therefore concludes that the empirical Bayes estimator should be adopted for high-dimensional data, and the variational Bayes estimator for low-dimensional data.
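A simulation of the kind summarized above might be organized as in the sketch below. It is a sketch under stated assumptions, not the study's design: the equicorrelated features, true coefficients of one, unit noise variance, and the exact definitions of bias, MSE and PMSE are all assumed here. The Bayesian estimator shown is a plain Gaussian-prior posterior mean with a fixed prior precision, used only as a stand-in; it is neither the EBLR nor the VBLR estimator of the article. The function name `simulate` and its parameters are hypothetical.

```python
import numpy as np


def simulate(n=30, p=5, rho=0.99, n_reps=1000, prior_precision=1.0, seed=0):
    """Monte Carlo comparison of OLS and a fixed-prior Bayesian posterior mean."""
    rng = np.random.default_rng(seed)
    beta = np.ones(p)  # assumed true coefficients
    # Equicorrelated features: cov = (1 - rho) * I + rho * 11'
    cov = (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))
    chol = np.linalg.cholesky(cov)

    coef_errors = {"ols": [], "bayes": []}
    pred_errors = {"ols": [], "bayes": []}
    for _ in range(n_reps):
        # Training sample and an independent test sample of the same size
        X = rng.standard_normal((n, p)) @ chol.T
        y = X @ beta + rng.standard_normal(n)
        X_new = rng.standard_normal((n, p)) @ chol.T
        y_new = X_new @ beta + rng.standard_normal(n)

        estimates = {
            "ols": np.linalg.lstsq(X, y, rcond=None)[0],
            # Posterior mean under beta ~ N(0, I / prior_precision), noise variance 1
            "bayes": np.linalg.solve(
                X.T @ X + prior_precision * np.eye(p), X.T @ y
            ),
        }
        for name, b in estimates.items():
            coef_errors[name].append(b - beta)
            pred_errors[name].append(np.mean((y_new - X_new @ b) ** 2))

    summary = {}
    for name in coef_errors:
        e = np.array(coef_errors[name])
        summary[name] = {
            "bias": float(np.abs(e.mean(axis=0)).mean()),  # mean absolute bias
            "mse": float((e**2).sum(axis=1).mean()),       # coefficient MSE
            "pmse": float(np.mean(pred_errors[name])),     # out-of-sample PMSE
        }
    return summary


if __name__ == "__main__":
    for name, metrics in simulate().items():
        print(name, metrics)
```

Varying `n` (for example from 30 to 50, or setting `p` close to `n`) gives a rough way to trace how each criterion behaves as the sample size changes, in the spirit of the comparison described in the abstract.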
