Stop Words Removal on Textual Data Classification
Date
2019-05
Publisher
Faculty of Communication and Information Sciences, University of Ilorin.
Abstract
Text data is highly voluminous, and mining it can be daunting because of the large amount of memory it requires. Researchers have therefore considered different techniques to reduce the data while maintaining or increasing the level of accuracy. Stop word removal is one of the pre-processing techniques used in text data mining. This paper investigates the effect of stop word removal on text data mining performance. Two machine learning algorithms, the C4.5 Decision Tree and Multinomial Naïve Bayes (MNB), were applied to two text datasets: a Sentiment Analysis dataset and an SMS Spam dataset. Results revealed that removing stop words had no influence on the classification accuracy of the text mining models, but did reduce the confidence level of the predictions.
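As a rough illustration of the kind of comparison the abstract describes, the following Python sketch trains a small multinomial Naive Bayes classifier twice, once on raw tokens and once after stop word removal, and reports the predicted label and its posterior probability for each. It is only a sketch under stated assumptions: the stop list, the four-message toy corpus, and all function names are hypothetical, the paper's own tools and datasets are not reproduced, and C4.5 is not covered. The toy corpus is far too small to say anything about the paper's findings.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop list for illustration only; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
              "you", "your", "this", "for", "at", "now"}

# Invented toy messages, NOT the datasets used in the paper.
TRAIN = [
    ("Win a free prize now", "spam"),
    ("Claim your free cash prize", "spam"),
    ("Are you coming to the meeting", "ham"),
    ("See you at lunch", "ham"),
]

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w]

def remove_stop_words(tokens):
    """Drop every token that appears in the stop list."""
    return [t for t in tokens if t not in STOP_WORDS]

def train_mnb(docs, labels):
    """Collect the counts needed for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_mnb(model, tokens):
    """Return (label, posterior probability) using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for t in tokens:
            if t in vocab:  # words unseen in training are ignored
                count = word_counts[label][t] + 1
                score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    # Convert log scores to normalised posterior probabilities.
    top = max(scores.values())
    exp = {k: math.exp(v - top) for k, v in scores.items()}
    z = sum(exp.values())
    best = max(exp, key=exp.get)
    return best, exp[best] / z

def compare(text):
    """Classify `text` with and without stop word removal."""
    labels = [label for _, label in TRAIN]
    raw_docs = [tokenize(msg) for msg, _ in TRAIN]
    filtered_docs = [remove_stop_words(d) for d in raw_docs]
    raw_model = train_mnb(raw_docs, labels)
    filtered_model = train_mnb(filtered_docs, labels)
    tokens = tokenize(text)
    return (predict_mnb(raw_model, tokens),
            predict_mnb(filtered_model, remove_stop_words(tokens)))

if __name__ == "__main__":
    raw, filtered = compare("free prize for you")
    print("with stop words:   ", raw)
    print("without stop words:", filtered)
```

Comparing the two returned pairs separates the two quantities the abstract distinguishes: whether the predicted label changes (accuracy) and how the posterior probability of that label changes (confidence of the prediction).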
Keywords
Sentiment Analysis, Stop Words Removal, Machine Learning, Text mining
Citation
Aro, T.O., Dada, F., Balogun, A.O. and Oluwasogo, S.A. (2019). Stop Words Removal on Textual Data Classification, International Journal of Information Processing and Communication (IJIPC), 7(1), 1-9.