Heterogeneous Ensemble Methods Based On Filter Feature Selection

dc.contributor.author: Ameen, Ahmed Oloduowo
dc.contributor.author: Balogun, Abdullateef Oluwagbemiga
dc.contributor.author: Usman, Ganiyat
dc.contributor.author: Fashoto, Gbenga Stephen
dc.date.accessioned: 2018-05-10T13:39:24Z
dc.date.available: 2018-05-10T13:39:24Z
dc.date.issued: 2016
dc.description.abstract: Although some computationally expensive methods can build highly accurate predictive models directly from high-dimensional data, many applications still benefit from reducing the dimensionality of the data before modeling. This research surveys three ensemble methods (Stacking, Voting and Multischeme) and three base classifiers (Multilayer Perceptron, K-Nearest Neighbour and NBTree), and presents a framework for measuring the performance of the base classifiers and ensemble methods with and without feature selection techniques (Principal Component Analysis, Information Gain Attribute Selection and Gain Ratio Attribute Selection). The enhancement consists of performing feature selection on the dataset prior to classification. The aim of the study is to evaluate the performance of the ensemble methods on the original and reduced datasets. 10-fold cross-validation is used to evaluate the ensemble methods and base classifiers on the R2L (Remote to Local) KDD Cup 1999 dataset and the UCI Vote dataset, using the Waikato Environment for Knowledge Analysis (WEKA) tool. The experiments showed that the reduced datasets yielded better results than the full datasets for the ensemble methods based on stacking, voting and multischeme. On the R2L dataset, the Multischeme ensemble method achieved an accuracy of 98.76% with PCA as feature selection, against 98.58% without feature selection. With gain ratio attribute selection, Multischeme achieved 98.93% accuracy against 98.76% without feature selection, and with information gain attribute selection it achieved 98.85% against 98.76% without feature selection.
For the Vote dataset, the Multischeme ensemble method performed best, with an accuracy of 92.18% with PCA feature selection against 89.88% without feature selection, 95.40% with information gain feature selection against 93.10% without feature selection, and 95.40% with gain ratio feature selection against 93.10% without feature selection. Inarguably, it can be concluded that ensemble methods work well with feature selection. (en_US)
dc.identifier.citation: Ameen A. O., Balogun A. O., Usman G. & Fashoto, S.G. (2016): Heterogeneous Ensemble Methods Based On Filter Feature Selection. Computing, Information System Development Informatics & Allied Research Journals. Vol 7 No 4. Pp 63-78 (en_US)
dc.identifier.issn: 2167-1710
dc.identifier.uri: http://hdl.handle.net/123456789/231
dc.language.iso: en (en_US)
dc.publisher: Research Nexus Africa’s Networks in Conjunction with The African Institute of Development Informatics & Policy (AIDIP) Ghana & The International Centre for Information Technology & Development (ICITD), USA (en_US)
dc.relation.ispartofseries: Volume: 7; Issue: 4
dc.subject: Machine Learning (en_US)
dc.subject: Data Mining (en_US)
dc.subject: Ensemble Methods (en_US)
dc.subject: Feature Selection (en_US)
dc.title: Heterogeneous Ensemble Methods Based On Filter Feature Selection (en_US)
dc.type: Article (en_US)
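
The abstract names Information Gain and Gain Ratio attribute selection as filter methods applied before classification. The article itself used WEKA for these, so the following is only a minimal, independent sketch of how the two scores can be computed for categorical attributes and used to rank them. The function names and the toy vote-style data are hypothetical and are not taken from the article or from WEKA.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in class entropy obtained by splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels remaining inside each feature value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature, labels) / split_info


def rank_features(columns, labels, score=information_gain):
    """Return feature names ordered from highest to lowest score."""
    return sorted(columns, key=lambda name: score(columns[name], labels),
                  reverse=True)


# Hypothetical toy data (not from the article): two yes/no attributes
# and a two-class label.
toy_columns = {
    "vote_a": ["y", "y", "n", "n", "y", "n"],
    "vote_b": ["y", "n", "y", "n", "y", "n"],
}
toy_labels = ["dem", "dem", "rep", "rep", "dem", "rep"]

ranking = rank_features(toy_columns, toy_labels, score=gain_ratio)
```

In a filter approach of this kind, the top-ranked attributes would be kept and the remainder dropped before any base classifier or ensemble is trained; how many attributes to keep is a separate choice that this sketch does not address.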

Files

Original bundle
Name: heterogeneous.pdf
Size: 332.02 KB
Format: Adobe Portable Document Format
Description: Main article