Heterogeneous ensemble with combined dimensionality reduction for social spam detection

Oladepo, Abdulfatai Ganiyu; Bajeh, Amos Orenyi; Balogun, Abdullateef Oluwagbemiga; Mojeed, Hammed Adeleye; Salman, Abdulsalam Abiodun; Bako, Abdullateef Iyanda

dc.contributor.author	Oladepo, Abdulfatai Ganiyu
dc.contributor.author	Bajeh, Amos Orenyi
dc.contributor.author	Balogun, Abdullateef Oluwagbemiga
dc.contributor.author	Mojeed, Hammed Adeleye
dc.contributor.author	Salman, Abdulsalam Abiodun
dc.contributor.author	Bako, Abdullateef Iyanda
dc.date.accessioned	2022-11-30T10:49:44Z
dc.date.available	2022-11-30T10:49:44Z
dc.date.issued	2021
dc.description.abstract	Spamming is one of the challenging problems in social networks. It involves spreading malicious or scam content on a network, which often leads to a huge loss in the value of real-time social network services, compromises user and system reputation, and jeopardizes users' trust in the system. Existing spam detection methods still suffer from misclassification caused by redundant and irrelevant features that arise from the high dimensionality of the datasets. This study presents a novel framework based on a heterogeneous ensemble method and a hybrid dimensionality reduction technique for spam detection in micro-blogging social networks. A hybrid of Information Gain (IG) and Principal Component Analysis (PCA) was implemented as the dimensionality reduction step for selecting important features, and a heterogeneous ensemble of Naïve Bayes (NB), K-Nearest Neighbor (KNN), Logistic Regression (LR) and Repeated Incremental Pruning to Produce Error Reduction (RIPPER) classifiers, combined by Average of Probabilities (AOP), was used for spam detection. To empirically investigate its performance, the proposed framework was applied to the MPI_SWS and SAC’13 Tip spam datasets, and the developed models were evaluated on accuracy, precision, recall, F-measure, and area under the curve (AUC). In the experimental results, the proposed framework (Ensemble + IG + PCA) outperformed the other methods tested on the studied spam datasets. Specifically, it had an average accuracy of 87.5%, an average precision of 0.877, an average recall of 0.845, an average F-measure of 0.872 and an average AUC of 0.943. The proposed framework also performed better than some existing approaches. Consequently, this study shows that addressing high dimensionality in spam datasets, in this case with a hybrid of IG and PCA, together with a heterogeneous ensemble method can produce a more effective model for detecting spam content.	en_US
dc.identifier.citation	Oladepo, A.G., Bajeh, A.O., Balogun, A.O., Mojeed, H.A., Salman, A.A. and Bako, A.I. (2021). Heterogeneous ensemble with combined dimensionality reduction for social spam detection. International Journal of Interactive Mobile Technologies (iJIM), 15(17), 84-103.	en_US
dc.identifier.uri	https://uilspace.unilorin.edu.ng/handle/20.500.12484/7984
dc.language.iso	en	en_US
dc.publisher	International Association of Online Engineering	en_US
dc.title	Heterogeneous ensemble with combined dimensionality reduction for social spam detection	en_US
dc.type	Article	en_US
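
The abstract names two building blocks that are easy to illustrate in isolation: ranking features by Information Gain, and combining base classifiers by Average of Probabilities. The following is a minimal sketch of those two ideas only, not the authors' implementation. It assumes discrete feature values, uses made-up toy data, and takes the base classifiers' probability outputs as given rather than training NB, KNN, LR or RIPPER. The PCA step is omitted, and all function and variable names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum_v P(feature = v) * H(labels | feature = v).

    Assumes the feature is discrete; continuous features would need
    to be binned first.
    """
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def average_of_probabilities(per_model_probs):
    """Average the class-probability vectors of several classifiers
    for a single sample."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [sum(p[c] for p in per_model_probs) / n_models
            for c in range(n_classes)]


# Toy data, invented for illustration: 1 = spam, 0 = legitimate.
labels = [1, 1, 0, 0]
has_link = [1, 1, 0, 0]  # perfectly separates the two classes
is_long = [1, 0, 1, 0]   # carries no information about the class

ig_link = information_gain(has_link, labels)
ig_long = information_gain(is_long, labels)

# Hypothetical [P(legitimate), P(spam)] outputs of four base classifiers.
probs = [[0.2, 0.8], [0.4, 0.6], [0.3, 0.7], [0.7, 0.3]]
avg = average_of_probabilities(probs)

# The ensemble predicts the class with the highest averaged probability.
predicted = max(range(len(avg)), key=avg.__getitem__)
```

In this toy case the informative feature gets a higher Information Gain than the uninformative one, and the averaged probabilities favour the spam class even though one of the four hypothetical classifiers disagrees.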

Files

Original bundle

Name: Heterogeneous Ensemble with Combined Dimensionality.pdf
Size: 1.22 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
