Comparisons of Some Outlier Detection Methods in Linear Regression Model

Date

2017

Publisher

Faculty of Physical Sciences, University of Ilorin, Nigeria

Abstract

Empirical evidence suggests that unusual or outlying observations are much more prevalent in data sets than one might expect; this paper therefore addresses multiple outliers in the linear regression model. Although reliable for a single outlier or a few outliers, standard diagnostic techniques based on an Ordinary Least Squares (OLS) fit can fail to identify multiple outliers. The parameter estimates, diagnostic quantities and model inferences obtained from a contaminated data set can differ significantly from those obtained from the clean data. A regression outlier is an observation that has an unusual value of the dependent variable Y, conditional on its value of the independent variable X. Four procedures for detecting outliers in linear regression were compared: Cook’s distance, DFFITS, DFBETAS, and the Mahalanobis distance. DFBETAS is the most efficient at detecting outliers when both the sample size and the percentage of outliers are small, but it has low sensitivity when the sample size is large. The Mahalanobis distance has greater power to detect a small percentage of outliers, regardless of sample size.
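
As an illustration of the four diagnostics named in the abstract, the following is a minimal sketch that computes them with NumPy from the standard textbook closed-form expressions for an OLS fit with an intercept. It is not the authors' code: the function name `regression_diagnostics`, the toy data, and the choice to compute the Mahalanobis distance on the predictors only are assumptions made here for illustration. The cutoffs mentioned in the comments are common rules of thumb; the abstract does not state which thresholds the paper used.

```python
import numpy as np

def regression_diagnostics(X, y):
    """Hypothetical helper: deletion diagnostics for an OLS fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])           # design matrix with intercept
    p = Z.shape[1]                                 # number of coefficients
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta = ZtZ_inv @ Z.T @ y
    resid = y - Z @ beta
    h = np.einsum("ij,jk,ik->i", Z, ZtZ_inv, Z)    # leverages (hat-matrix diagonal)
    s2 = resid @ resid / (n - p)                   # residual variance, full fit
    # Residual variance with observation i deleted (closed form, no refitting)
    s2_del = ((n - p) * s2 - resid**2 / (1 - h)) / (n - p - 1)
    r_int = resid / np.sqrt(s2 * (1 - h))          # internally studentized residual
    t_ext = resid / np.sqrt(s2_del * (1 - h))      # externally studentized residual

    cooks = r_int**2 * h / (p * (1 - h))
    dffits = t_ext * np.sqrt(h / (1 - h))
    # Row i of dfbeta is beta - beta_(i), the coefficient change when i is deleted
    dfbeta = (Z @ ZtZ_inv) * (resid / (1 - h))[:, None]
    dfbetas = dfbeta / np.sqrt(s2_del[:, None] * np.diag(ZtZ_inv)[None, :])

    # Assumption: squared Mahalanobis distance measured in predictor space only
    diff = X - X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    md2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

    return {"leverage": h, "cooks": cooks, "dffits": dffits,
            "dfbetas": dfbetas, "mahalanobis_sq": md2}

# Toy data (made up for illustration): a clean line with one planted Y-outlier
x = np.linspace(0.0, 9.0, 20)
y = 2.0 + 3.0 * x + 0.1 * np.sin(np.arange(20))
y[10] += 25.0                                      # contaminate one response

d = regression_diagnostics(x, y)

# Common rule-of-thumb cutoffs (not taken from the paper):
#   Cook's distance > 4/n, |DFFITS| > 2*sqrt(p/n), |DFBETAS| > 2/sqrt(n)
n_obs, p_coef = len(y), 2
flag_cook = d["cooks"] > 4 / n_obs
flag_dffits = np.abs(d["dffits"]) > 2 * np.sqrt(p_coef / n_obs)
flag_dfbetas = (np.abs(d["dfbetas"]) > 2 / np.sqrt(n_obs)).any(axis=1)
print("Largest Cook's distance at index", int(np.argmax(d["cooks"])))
```

In this toy example the planted point is unusual only in Y, so a Mahalanobis distance computed on X alone does not single it out, whereas the residual-based measures do. With several outliers, single-case deletion measures like these can miss contaminated points because of masking, which is the situation the abstract describes.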

Keywords

Mahalanobis distance, Cook's distance, Masking effect, DFBETAS, DFFITS

Citation

Ilorin Journal of Science
