DSpace Repository :: Browsing by Author "Sangaiah, A. K."

Feature selection and computational optimization in high-dimensional microarray cancer datasets via InfoGain-modified bat algorithm
(Multimedia Tools and Applications, 2022) Hambali, M. A.; Oladele, Tinuke Omolewa; Adewole, K. S.; Sangaiah, A. K.; Gao, W.
Achieving satisfactory cancer classification accuracy with the complete set of genes remains a great challenge because of the high dimensionality, small sample size, and noise in gene expression data. Feature reduction is critical and sensitive in the classification task, most importantly in heterogeneous multimedia data. One of the major difficulties in cancer studies is recognizing informative genes among the thousands of genes available in microarray data. Traditional feature selection algorithms have failed to scale to large feature spaces such as microarray data. Therefore, an effective feature selection algorithm is required to find the most significant subset of genes by removing non-predictive genes from the dataset without compromising the accuracy of the classification algorithm. The study proposed an Information Gain-Modified Bat Algorithm (InfoGain-MBA) feature selection model for selecting relevant and informative features from high-dimensional microarray cancer datasets and evaluated the approach with four classifiers: C4.5, Decision Tree, Random Forest, and classification and regression tree (CART). The results show that the proposed approach is promising for the classification of microarray cancer data. Random Forest achieved 100% accuracy with few genes on all seven datasets used. Further investigations were also conducted to determine the optimal threshold for each of the datasets.
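The information gain filtering step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes gene expression values have already been discretized, it covers only the information gain ranking and thresholding (the modified bat algorithm search is not shown), and the names `entropy`, `information_gain`, and `select_by_threshold` are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on one feature.

    Assumes feature_values are already discretized (e.g. "high"/"low").
    """
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_by_threshold(features, labels, threshold):
    """Keep features whose information gain exceeds a hypothetical threshold.

    features maps a gene name to its list of discretized values.
    Returns the selected gene names and the gain of every gene.
    """
    gains = {name: information_gain(vals, labels) for name, vals in features.items()}
    selected = [name for name, gain in gains.items() if gain > threshold]
    return selected, gains
```

In a pipeline of the kind the abstract describes, the genes that survive this filter would then be passed to a search procedure and a classifier; the threshold value is dataset-dependent and is an assumption here.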
Multi-objective scheduling of MapReduce jobs in big data processing
(Multimedia Tools and Applications, 2017) Hashem, I. A. T.; Anuar, N. B.; Marjani, M.; Gani, A.; Sangaiah, A. K.; Adewole, K. S.
Data generation has increased drastically over the past few years due to the rapid development of Internet-based technologies. This period has been called the big data era. Big data offer an emerging paradigm shift in data exploration and utilization. The MapReduce computational paradigm is a well-known framework and is considered the main enabler for the distributed and scalable processing of a large amount of data. However, despite recent efforts toward improving the performance of MapReduce, scheduling MapReduce jobs across multiple nodes has been considered a multi-objective optimization problem. This problem can become increasingly complex when virtualized clusters in cloud computing are used to execute a large number of tasks. This study aims to optimize MapReduce job scheduling based on the completion time and cost of cloud service models. First, the problem is formulated as a multi-objective model. The model consists of two objective functions, namely, (i) completion time and (ii) cost minimization. Second, a scheduling algorithm using earliest finish time scheduling that considers resource allocation and job scheduling in the cloud is proposed. Lastly, experimental results show that the proposed scheduler exhibits better performance than other well-known schedulers, such as FIFO and Fair.
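The idea of earliest finish time scheduling with the two objectives named in the abstract above (completion time and cost) can be sketched as follows. This is a simplified greedy illustration, not the scheduler proposed in the paper: the task and node models, the per-time-unit pricing, and the function name `schedule_earliest_finish` are all assumptions made for the example.

```python
def schedule_earliest_finish(tasks, nodes):
    """Greedy earliest-finish-time assignment of independent tasks to nodes.

    tasks: dict mapping task name -> amount of work (hypothetical units).
    nodes: dict mapping node name -> (speed, cost_per_time_unit), both assumed.
    Returns (assignment, makespan, total_cost).
    """
    available_at = {name: 0.0 for name in nodes}  # when each node becomes free
    assignment = {}
    total_cost = 0.0

    for task, work in tasks.items():
        best_node, best_finish = None, float("inf")
        for name, (speed, _rate) in nodes.items():
            finish = available_at[name] + work / speed
            if finish < best_finish:  # ties keep the first node seen
                best_node, best_finish = name, finish

        speed, rate = nodes[best_node]
        total_cost += (work / speed) * rate  # pay only for the time the task runs
        available_at[best_node] = best_finish
        assignment[task] = best_node

    makespan = max(available_at.values())
    return assignment, makespan, total_cost
```

The sketch picks nodes by finish time alone and only reports cost afterwards; a genuinely multi-objective scheduler would weigh both objectives when choosing a node.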
SMSAD: a framework for spam message and spam account detection
(Multimedia Tools and Applications, 2017) Adewole, K. S.; Anuar, N. B.; Kamsin, A.; Sangaiah, A. K.
Short message communication media, such as mobile and microblogging social networks, have become attractive platforms for spammers to disseminate unsolicited content. However, traditional content-based methods for spam detection have degraded in performance due to many factors. For instance, unlike the content posted on social networks like Facebook and Renren, SMS and microblogging messages have limited size and contain many domain-specific words, such as idioms and abbreviations. In addition, microblogging messages are very unstructured and noisy. These distinguishing characteristics pose challenges to existing email spam detection models for effective spam identification in short message communication media. The state-of-the-art solutions for social spam account detection have faced different evasion tactics in the hands of intelligent spammers. In this paper, a unified framework is proposed for both spam message and spam account detection tasks. We utilized four datasets in this study, two of which are from the SMS spam message domain and the remaining two from the Twitter microblog. To identify a minimal number of features for spam account detection on Twitter, this paper studied a bio-inspired evolutionary search method. Using an evolutionary search algorithm, a compact model for spam account detection is proposed, which is incorporated in the machine learning phase of the unified framework. The results of the various experiments conducted indicate that the proposed framework is promising for detecting both spam messages and spam accounts with a minimal number of features.
