Anomaly Detection in Dataset for Improved Model Accuracy Using DBSCAN Clustering Algorithm

dc.contributor.author: Ajiboye, A.R.
dc.contributor.author: Akintola, A.G.
dc.contributor.author: Ameen, A.O.
dc.date.accessioned: 2018-12-03T13:24:34Z
dc.date.available: 2018-12-03T13:24:34Z
dc.date.issued: 2015-03
dc.description: Main article [en_US]
dc.description.abstract: The purity of the dataset used for model construction plays an important role in the accuracy and reliability of the resulting model. Outliers are often caused by noisy data resulting from mechanical faults, changes in system behaviour, or human error. It is therefore essential to pre-process a dataset prior to modelling, in order to distinguish data that appear normal from data that appear abnormal within the sample space. One important reason for removing outliers is to prevent them from contaminating the dataset, which can have serious consequences if they are left in place. This paper proposes an effective measure that automatically clusters outliers in a dataset using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) technique. RapidMiner, an open-source software tool, is used to experiment on sample datasets; based on the characteristics of the data objects, clusters are formed that filter outliers out of the dataset being explored. The experimental results show that the DBSCAN algorithm is a suitable technique for outlier detection and is capable of separating abnormal data from a mixture of noise and normal data. [en_US]
dc.identifier.citation: African Journal of Computing & ICT [en_US]
dc.identifier.issn: 2006-1781
dc.identifier.uri: http://hdl.handle.net/123456789/1335
dc.language.iso: en [en_US]
dc.publisher: IEEE Nigeria Chapter. [en_US]
dc.relation.ispartofseries: Vol. 8, No. 1
dc.subject: Anomaly detection [en_US]
dc.subject: DBSCAN [en_US]
dc.subject: clustering [en_US]
dc.subject: model-building [en_US]
dc.subject: algorithm [en_US]
dc.subject: noisy data [en_US]
dc.title: Anomaly Detection in Dataset for Improved Model Accuracy Using DBSCAN Clustering Algorithm [en_US]
dc.type: Article [en_US]
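
The DBSCAN-based filtering described in the abstract can be illustrated with a short Python sketch. This is not the authors' implementation: the paper reports experiments run in RapidMiner, and everything below (the function names dbscan and region_query, the sample points, and the parameter values eps=0.5 and min_pts=3) is assumed purely for illustration. The sketch labels each point with a cluster id, marks points that belong to no dense region as noise, and then separates the noise from the remaining data.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster (treated as outliers)


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label every point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable, but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also core: keep expanding
    return labels


# Hypothetical sample data: two dense groups plus two isolated points.
data = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),  # dense group A
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),  # dense group B
    (9.0, 0.5),                                      # isolated point
    (0.0, 8.0),                                      # isolated point
]

# eps and min_pts are assumed values chosen to suit this toy data.
labels = dbscan(data, eps=0.5, min_pts=3)

# Split the dataset into outliers (noise) and the cleaned remainder.
outliers = [p for p, lab in zip(data, labels) if lab == NOISE]
clean = [p for p, lab in zip(data, labels) if lab != NOISE]

print("labels:  ", labels)
print("outliers:", outliers)
```

With these assumed settings, the two isolated points are labelled as noise and the eight grouped points fall into two clusters. On real data, the choice of eps and min_pts determines what counts as an outlier, so both values would need tuning to the dataset being explored.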

Files

Original bundle
Name: IEEE Journal_file4.pdf
Size: 169.97 KB
Format: Adobe Portable Document Format
Description: Article