A novel approach to outliers removal in a noisy numeric dataset for efficient mining

dc.contributor.author: Ajiboye, A. R.
dc.contributor.author: Adewole, K. S.
dc.contributor.author: Babatunde, R. S.
dc.contributor.author: Oladipo, I. D.
dc.date.accessioned: 2017-11-23T14:03:50Z
dc.date.available: 2017-11-23T14:03:50Z
dc.date.issued: 2016
dc.description.abstract: Data pre-processing is a key task in the data mining process. The task generally consumes the largest portion of the total data engineering effort involved in unveiling useful patterns from datasets. Basically, data mining is about fitting descriptive or predictive models to data. However, the presence of outliers sometimes reduces the reliability of the models created. It is, therefore, essential to have raw data properly pre-processed before exploring them for mining. In this paper, an algorithm that detects and removes outliers in a numeric dataset is proposed. To establish the effectiveness of the proposed algorithm, the clean data obtained through the implementation of the proposed approach is used to create a prediction model. Similarly, the clean data obtained through the use of one of the existing techniques, a clustering method, is also used to create a prediction model. Each of the models created is simulated using a set of data not used in training, and the error associated with each model is measured. The resulting outputs from the two approaches reveal that the prediction model created using the output from the proposed algorithm has an error of 0.38, while the prediction model created using the cleaned data from the clustering method gives an error of 0.61. Comparison of the errors associated with the models created using the two approaches shows that the proposed algorithm is suitable for cleaning numeric datasets. The results of the experiment also show that the proposed approach is efficient and can be used as an alternative to other existing cleaning methods. [en_US]
dc.identifier.uri: http://hdl.handle.net/123456789/18
dc.language.iso: en [en_US]
dc.publisher: Ilorin Journal of Computer Science and Information Technology [en_US]
dc.subject: Algorithm; Data mining; Data pre-processing; Outliers; Prediction [en_US]
dc.title: A novel approach to outliers removal in a noisy numeric dataset for efficient mining [en_US]
dc.type: Article [en_US]
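
The evaluation protocol described in the abstract (clean the training data, fit a prediction model, then measure its error on data not used for training) might be sketched roughly as follows. This is only an illustration under stated assumptions: the record does not specify the proposed algorithm, the clustering method, the model type, or the error measure, so the sketch substitutes a Tukey IQR filter as a stand-in cleaner, a no-cleaning baseline, a one-variable least-squares line as the model, and RMSE as the error. All function names and the synthetic data are hypothetical.

```python
import math
import statistics


def iqr_filter(rows, k=1.5):
    """Stand-in cleaner (NOT the paper's algorithm): drop rows whose
    target value lies outside the Tukey fences Q1 - k*IQR, Q3 + k*IQR."""
    ys = [y for _, y in rows]
    q1, _, q3 = statistics.quantiles(ys, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [(x, y) for x, y in rows if low <= y <= high]


def no_filter(rows):
    """Baseline: keep the noisy data unchanged."""
    return list(rows)


def fit_line(rows):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = statistics.fmean([x for x, _ in rows])
    mean_y = statistics.fmean([y for _, y in rows])
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(model, rows):
    """Root mean squared error of the fitted line on the given rows."""
    slope, intercept = model
    return math.sqrt(
        statistics.fmean([(slope * x + intercept - y) ** 2 for x, y in rows])
    )


def evaluate(cleaner, train, test):
    """Clean the training data, fit a model, and score it on held-out data."""
    return rmse(fit_line(cleaner(train)), test)


# Synthetic data (an assumption for illustration): y = 2x + 1 with two
# injected outliers in the training set and a clean held-out set.
train = [(x, 2 * x + 1) for x in range(20)]
train[5] = (5, 100)
train[12] = (12, -80)
test = [(x, 2 * x + 1) for x in range(20, 25)]

error_cleaned = evaluate(iqr_filter, train, test)
error_raw = evaluate(no_filter, train, test)
print(f"held-out error with cleaning:    {error_cleaned:.4f}")
print(f"held-out error without cleaning: {error_raw:.4f}")
```

To compare two cleaning methods as the abstract does, each method would be passed to `evaluate` as the `cleaner` argument with the same training and held-out data, and the resulting errors compared.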

Files

Original bundle
Name: Paper C.pdf
Size: 355.48 KB
Format: Adobe Portable Document Format
Description: Main article