Comparison of Outlier Detection Procedures in Multiple Linear Regressions

dc.contributor.author: Oyeyemi, G. M.
dc.contributor.author: Bukoye, A.
dc.contributor.author: Akeyede, I.
dc.date.accessioned: 2023-07-27T08:43:49Z
dc.date.available: 2023-07-27T08:43:49Z
dc.date.issued: 2015
dc.description.abstract: Regression analysis has become one of the most widely used statistical tools for analyzing multifactor data. It is appealing because it provides a conceptually simple method for investigating functional relationships among variables. A relationship is expressed in the form of an equation or model connecting the response (dependent) variable and one or more explanatory (predictor) variables. The major problem that statisticians face in regression analysis is the presence of outliers in the data. An outlier is an observation that lies outside the overall pattern of a distribution. In other words, it is a point that falls more than 1.5 times the interquartile range above the third quartile or below the first quartile. Several statistics are available to detect whether outliers are present in data. In this study, a simulation was conducted to investigate the performance of DFFITS, Cook's distance and Mahalanobis distance at different proportions of outliers (10%, 20% and 30%) and for various sample sizes (10, 30 and 100), with outliers in the first, the second or both independent variables. The data were generated in R from a normal distribution, while the outliers were drawn from a uniform distribution. Findings: For small and medium sample sizes with 10% outliers, Mahalanobis distance should be employed for its accuracy in detecting outliers. For small, medium and large sample sizes with a higher percentage of outliers, DFFITS should be employed. For small, medium and large sample sizes, DFFITS should be used to detect outliers irrespective of the percentage of outliers in the data set. For small samples with a low percentage of outliers, Mahalanobis distance should be employed for its ease of computation. [en_US]
dc.description.sponsorship: Self-sponsored [en_US]
dc.identifier.citation: American Journal of Mathematics and Statistics [en_US]
dc.identifier.uri: https://uilspace.unilorin.edu.ng/handle/20.500.12484/11613
dc.language.iso: en [en_US]
dc.publisher: Scientific and Academic Publishing [en_US]
dc.relation.ispartofseries: 5(1); 34-41
dc.subject: Outliers, Linear regression, Simulation, Probability [en_US]
dc.title: Comparison of Outlier Detection Procedures in Multiple Linear Regressions [en_US]
dc.type: Article [en_US]
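
The simulation design summarized in the abstract can be illustrated with a minimal sketch. The authors worked in R; the Python/NumPy code below is not their code. It assumes a hypothetical two-predictor model with made-up coefficients, outliers placed in the first predictor only, a Uniform(5, 10) contaminating distribution, and conventional rule-of-thumb cutoffs. None of these specifics are stated in the record.

```python
import math
import numpy as np


def diagnostics(X, y):
    """OLS influence measures for y regressed on X (intercept added here)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    p = k + 1                                      # number of fitted coefficients
    XtX_inv = np.linalg.inv(A.T @ A)
    h = np.einsum("ij,jk,ik->i", A, XtX_inv, A)    # leverages: diagonal of the hat matrix
    resid = y - A @ (XtX_inv @ A.T @ y)
    s2 = resid @ resid / (n - p)                   # residual variance
    # Leave-one-out variance, then externally studentized residuals
    s2_del = ((n - p) * s2 - resid**2 / (1 - h)) / (n - p - 1)
    t = resid / np.sqrt(s2_del * (1 - h))
    dffits = t * np.sqrt(h / (1 - h))
    cooks = resid**2 * h / (p * s2 * (1 - h) ** 2)
    # Squared Mahalanobis distance of each predictor row from the predictor mean
    centered = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.atleast_2d(np.cov(X, rowvar=False)))
    md2 = np.einsum("ij,jk,ik->i", centered, S_inv, centered)
    return {"leverage": h, "dffits": dffits, "cooks": cooks, "md2": md2}


# --- Illustrative setup: every value below is an assumption ---
rng = np.random.default_rng(0)
n, k, frac = 30, 2, 0.10                 # medium sample size, 10% outliers
m = int(round(frac * n))                 # number of contaminated rows
X = rng.normal(size=(n, k))              # clean predictors from a normal distribution
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(size=n)
X[:m, 0] = rng.uniform(5.0, 10.0, size=m)  # outliers in the first predictor only

d = diagnostics(X, y)
p = k + 1

# Conventional rule-of-thumb cutoffs (not necessarily those used in the study)
flag_dffits = np.abs(d["dffits"]) > 2 * math.sqrt(p / n)
flag_cooks = d["cooks"] > 4 / n
# 97.5% chi-square quantile; the closed form -2*ln(1-q) holds only for 2 predictors
flag_md = d["md2"] > -2 * math.log(0.025)

for name, flag in [("DFFITS", flag_dffits), ("Cook's D", flag_cooks), ("Mahalanobis", flag_md)]:
    print(name, "flags rows", np.flatnonzero(flag))
```

Rows 0 to m-1 are the contaminated ones, so comparing the printed indices against that range gives a rough sense of each statistic's detection rate for a single replicate. A comparison like the one in the study would repeat this over many replicates and over the other sample sizes and outlier proportions.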
