Browsing by Author "Ajiboye, A.R."
Now showing 1 - 14 of 14
Item: Anomaly Detection in Dataset for Improved Model Accuracy Using DBSCAN Clustering Algorithm (IEEE Nigeria Chapter, 2015-03)
Authors: Ajiboye, A.R.; Akintola, A.G.; Ameen, A.O.

The purity of the dataset used for model construction plays an important role in the accuracy and reliability of the resulting model. Outliers are often caused by noisy data arising from mechanical faults, changes in system behaviour, or human error. It is therefore essential to pre-process the dataset prior to modelling, in order to differentiate between data that appear normal and abnormal within the sample space. One important reason for removing outliers is to prevent their contaminating effect on the dataset, which can lead to seriously misleading results. An effective measure that automatically clusters outliers in the dataset using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) technique is proposed in this paper. RapidMiner, an open-source software tool, is used to experiment on some sample datasets; based on the characteristics of the data objects, clusters are formed which filter out outliers from the dataset being explored. The experimental results from this study show that the DBSCAN algorithm is a suitable technique for outlier detection and is capable of filtering the abnormal data from a combination of noisy and normal data.

Item: Cluster Analysis of Data Points using Partitioning and Probabilistic Model-based Algorithms (Foundation of Computer Science, 2014-08)
Authors: Ajiboye, A.R.; Isah-Kebbe, H.; Oladele, T.O.

Exploring the features of a dataset through the application of clustering algorithms is a viable means by which a conceptual description of the data can be revealed for better understanding, grouping and decision making. Some clustering algorithms, especially those that are partitioning-based, cluster any data presented to them even when similar features are not present.
This study explores the performance accuracies of partitioning-based algorithms and a probabilistic model-based algorithm. Experiments were conducted using k-means, k-medoids and the EM algorithm. The study implements each algorithm using RapidMiner software, and the results generated were validated for correctness in accordance with the external criteria method. The clusters formed revealed the capabilities and drawbacks of each algorithm on the data points.

Item: Comparative Analysis of Predictive Models Created Based on Some Multi-layered Neural Networks Models (Dept. of Computer Science, LAUTECH Ogbomoso, 2018)
Authors: Ajiboye, A.R.; Mabayoje, M.A.; Adewole, K.S.

Multilayer back-propagation neural networks are network structures built with distinct layers. Due to some inherent limitations of single-layered networks in solving nonlinear problems, it is desirable to have hidden layer(s) with an adequate number of neurons, as this gives more processing power to the network. The main objective of this study was to unveil the architecture that produces the most accurate predictive model among the multilayer back-propagation neural networks considered. This study specifically measures and compares the accuracies of the models created using the feed-forward back-propagation, cascade-forward back-propagation and Elman back-propagation neural networks. The required datasets for implementation were retrieved from online public repositories. Experiments were conducted and repeated using two different datasets in order to establish the consistency of the network outputs.
Findings from this study are based on a number of metrics, and the results show that, among the three architectures considered, the predictive model created using the feed-forward neural network architecture records the lowest error and is found to converge at the lowest epoch.

Item: Comparative Approach of Back-Propagation Neural Network and Decision Tree on Breast Cancer Classification: An Appraisal (Dept. of Computer Science, LAUTECH Ogbomoso, 2019)
Authors: Babatunde, R.S.; Adewole, K.S.; Ajiboye, A.R.

The use of data mining methods to support decision making has been increasing in the past decades. Data mining simply refers to extracting or mining knowledge from large amounts of data. Over the years, medical image processing has benefited immensely from data mining techniques, including in breast cancer diagnosis. Sonography (also known as ultrasound) has become a valuable addition to mammography and magnetic resonance imaging (MRI) as an imaging technique for breast cancer screening. This technique is, however, time-consuming and often characterized by low accuracy; hence the need to develop a robust classification model with high accuracy and a reduced false-alarm rate. In this paper, a comparison of the performance of back-propagation neural networks (BPNN) and the C4.5 decision tree (DT) for breast cancer prediction was carried out. A filter-based feature selection approach using a correlation filter was employed to rank features according to their predictive power. The models were simulated using the WEKA data mining tool, and an extensive comparative study was performed based on standard evaluation metrics. The performance of the two classifiers was compared based on their predictive accuracy, precision, recall, kappa statistic and other relevant statistical measures. The simulation results show that C4.5 outperforms BPNN in terms of training time (0.16 secs) and accuracy (94.2857%), while BPNN has a training time of 46.9 secs and an accuracy of 90.9524%.
However, the results also reveal that BPNN outperforms C4.5 in terms of error rate, with BPNN having a mean absolute error of 0.0542 while C4.5 has a mean absolute error of 0.0834. It can therefore be deduced from the comparison that C4.5 is a good option for the prediction task, considering the fast training time of the algorithm as well as its high prediction accuracy.

Item: COMPARING THE PERFORMANCE OF PREDICTIVE MODELS CONSTRUCTED USING THE TECHNIQUES OF FEED-FORWARD AND GENERALIZED REGRESSION NEURAL NETWORKS (Universiti Malaysia Pahang, 2016-02)
Authors: Ajiboye, A.R.; Abdullah-Arshah, R.; Honqwu, Q.; Abdul-Hadi, J.

Construction of a predictive model is primarily aimed at using known attributes to determine present or future unknown attributes for efficient planning and decision making. The accuracy of a predictive model is therefore paramount to achieving network outputs that are well correlated with the known or target output. In this paper, two predictive models are constructed using the techniques of feed-forward and generalized regression neural networks. Experiments are conducted with Matlab software and the performance of the two models is evaluated for accuracy. Their simulated outputs are compared to determine their response to untrained data. Findings from this study show that the generalized regression neural network consistently gives a more accurate result. The mean absolute error computed for the two models also reveals that the feed-forward neural network records the higher error value.

Item: EVALUATING THE EFFECT OF DATASET SIZE ON PREDICTIVE MODEL USING SUPERVISED LEARNING TECHNIQUE (Universiti Malaysia Pahang, 2015-02)
Authors: Ajiboye, A.R.; Abdullah-Arshah, R.; Hongwu, Q.

Learning models used for prediction purposes are mostly developed without paying much attention to the size of dataset that can produce models of high accuracy and better generalization, although the general belief is that a large dataset is needed to construct a predictive learning model.
To describe a dataset as large is, perhaps, circumstance-dependent; thus, what constitutes a big or small dataset is vague. In this paper, the ability of a predictive model to generalize with respect to a particular size of data, when simulated with new untrained input, is examined. The study experiments on three different sizes of data using a Matlab program to create predictive models, with a view to establishing whether the size of data has any effect on the accuracy of a model. The simulated output of each model is measured using the Mean Absolute Error (MAE) and comparisons are made. Findings from this study reveal that the quantity of data partitioned for training must be a good representation of the entire set and sufficient to span the input space. The results of simulating the three network models also show that the learning model with the largest training set appears to be the most accurate and consistently delivers much better and more stable results.

Item: Frequent Pattern and Association Rule Mining from Inventory Database Using Apriori Algorithm (African Journal of Computing & ICT, 2014-09)
Authors: Adewole, K.S.; Akintola, A.G.; Ajiboye, A.R.

Recently, data mining has attracted a great deal of attention in the information industry and in a society where data continue to grow on a daily basis. The availability of huge amounts of data and the imminent need to turn such data into useful information and knowledge is the major focus of data mining. The information and knowledge obtained from large data can be used in applications ranging from market analysis, fraud detection and production control to customer retention and science exploration. A record in such data typically consists of the transaction date and the items bought in the transaction. Successful organizations view such databases as important pieces of their marketing infrastructure.
This paper considers the problem of mining association rules between items in a large database of sales transactions in order to understand customer buying habits for the purpose of improving sales. The Apriori algorithm was used to generate strong rules from an inventory database. It was found that Apriori is well suited for mining frequent itemsets in transactional databases where many transaction items recur as supersets. The algorithm was implemented using PHP, and the MySQL database management system was used for storing the inventory data. The algorithm produces the frequent itemsets completely and generates accurate strong rules.

Item: An Improved Technique for the Removal and Replacement of the Inconsistencies in Numeric Dataset (IEEE Nigeria Chapter, 2015-05)
Authors: Abdul-Hadi, J.; Ajiboye, A.R.; Abba, A.

The task of removing anomalies from an unclean numeric dataset, with a view to putting the data in a suitable format for exploration, is a major phase in the data mining process. In the process of exploring an unclean numeric dataset to unveil its useful patterns or structure, a thorough pre-processing task is inevitable in order to achieve a noise-free dataset. Poor-quality data can be misleading if analysed or used to build models; hence, there is a need to remove discrepancies that may be present in the data prior to exploring them. In this paper, a cleaning algorithm is proposed and implemented in order to remove the inconsistencies in a numeric dataset. The proposed algorithm is implemented in the Java language, and the resulting outputs reveal the efficiency of the proposed approach. In order to evaluate its effectiveness, the proposed algorithm is compared to one of the existing methods based on some metrics. The comparisons show that the proposed technique is efficient and can be used as an alternative technique for the removal of outliers in numeric data.
This approach is also found to be reliable, as it consistently gives an accurate output that is free of outliers.

Item: INVESTIGATING THE EFFECT OF DATA NORMALIZATION ON PREDICTIVE MODELS (Faculty of Communication and Information Sciences, 2017)
Authors: Ajiboye, A.R.; Ajiboye, I.K.; Salihu, S.A.; Tomori, R.A.

The creation of a predictive model using a supervised learning approach involves building a model of the target variable as a function of the explanatory variables. Before a model is created, it is necessary to put the data in a suitable format. Studies have shown that normalization of data is crucial to descriptive mining, as it improves the accuracy and efficiency of mining algorithms. In the case of prediction, however, predictive models are not always created from normalized data. This paper presents the experimental results of investigating the effect of normalizing the input variables on models created for prediction purposes. Experiments are conducted on the creation of predictive models from two different datasets of equal size using neural network techniques. The trained network models, created with the same architecture and configuration, are subsequently simulated using a set of untrained data. The evaluation results and the comparison of the models created from the two data formats reveal that the model created from normalized data is more accurate, as a decrease in error of 0.003 is consistently recorded. That model also converges much earlier than the model created from data that did not undergo any form of normalization.

Item: A NOVEL APPROACH TO OUTLIERS REMOVAL IN A NOISY NUMERIC DATA SET FOR EFFICIENT MINING (Department of Computer Science, University of Ilorin, 2016)
Authors: Ajiboye, A.R.; Adewole, K.S.; Babatunde, R.S.; Oladipo, I.D.

Data pre-processing is a key task in the data mining process.
The task generally consumes the largest portion of the total data engineering effort involved in unveiling useful patterns from datasets. Basically, data mining is about fitting descriptive or predictive models to data. However, the presence of outliers sometimes reduces the reliability of the models created. It is, therefore, essential to have raw data properly pre-processed before exploring them for mining. In this paper, an algorithm that detects and removes outliers in a numeric dataset is proposed. In order to establish the effectiveness of the proposed algorithm, the clean data obtained through its implementation is used to create a prediction model. Similarly, the clean data obtained through the use of one of the existing techniques is also used to create a prediction model. Each of the models created is simulated using a set of untrained data, and the error associated with each model is measured. The resulting outputs from the two approaches reveal that the prediction model created using the output from the proposed algorithm has an error of 0.38, while the prediction model created using the cleaned data from the clustering method gives an error of 0.61. Comparison of the errors associated with the models created using the two approaches shows that the proposed algorithm is suitable for cleaning numeric datasets. The results of the experiment also unveil that the proposed approach is efficient and can be used as an alternative to other existing cleaning methods.

Item: Risk Status Prediction and Modelling Of Students’ Academic Achievement - A Fuzzy Logic Approach (2013-11)
Authors: Ajiboye, A.R.; Abdullah-Arshah, R.; Honqwu, Q.

Many students fall victim to a low grade point average (GPA) at the end of their first year in institutions of higher learning, and some are even withdrawn due to an unacceptable GPA; this could be prevented if the necessary measures were taken at the appropriate time.
In this paper, a model using a fuzzy logic approach to predict the risk status of students based on some predictive factors is proposed. Basic information that correlates with students’ academic achievement, together with other predictive variables, was modelled; the simulated model shows the degree of risk associated with students' past academic achievement. The results of this study would enable teachers to pay more attention to students’ weaknesses, and could also help school management in decision making, especially for the purpose of awarding scholarships to talented students whose risk of failure is found to be very low, while students identified as having a high risk of failure could be counselled and motivated with a view to improving their learning ability.

Item: TSDL: A Framework for Tip Spam Detection in Location Based Social Network (Nigeria Computer Society, 2017)
Authors: Adewole, K.S.; Isiaka, R.M.; Jimoh, R.G.; Ajiboye, A.R.

In Web 2.0 systems, the Location Based Social Network (LBSN) has become increasingly popular because of its ability to locate places, such as restaurants, hotels and nearby facilities, that can render services to users. One such LBSN is Apontador, a popular social network introduced in Brazil a couple of years ago. However, despite the numerous opportunities offered by the Apontador LBSN, it has become an attractive social network for tip spam distribution. In this work, an intelligent machine learning framework is proposed to detect tip spam in the Apontador LBSN. The proposed framework explores three classification algorithms: Random Forest, Multilayer Perceptron, and Decorate. Bio-inspired feature identification was studied using an Evolutionary Algorithm (EA) and Particle Swarm Optimization (PSO) to identify the discriminating features for tip spam detection in LBSN. Three experiments were conducted, based on sixty (60), twenty-two (22), and ten (10) features, to ascertain the performance of the proposed framework for tip spam detection.
Based on the various experiments conducted, the Random Forest classification algorithm produced an accuracy and F-measure of 90.5% and an ROC of 96.7% using the 60 features extracted from the Apontador LBSN. The algorithm also outperformed the two other classifiers in the EA evaluation. However, the Decorate classifier produced the best results in the PSO evaluation, achieving an F-measure and ROC of 86.3% and 92.9% respectively. The experimental results show that the proposed framework improves the performance of tip spam detection in the Apontador LBSN.

Item: Using an Enhanced Feed-Forward Neural Network Technique for Prediction of Students' Performance (International Scientific Academy of Engineering & Technology, 2015-05)
Authors: Ajiboye, A.R.; Abdullah-Arshah, R.; Honqwu, Q.

Newly admitted students on undergraduate programmes in institutions of higher learning sometimes experience academic adjustment that is associated with stress; many factors have been attributed to this, which often results in a high percentage of failure and a low Grade Point Average (GPA). Computing the earlier academic achievements of these students would keep one abreast of their level of academic knowledge, in order to be well informed of their areas of weakness and strength. In this paper, an enhancement of the feed-forward neural network for the creation of a network model to predict students' performance based on their historical data is proposed. In the course of experimentation with Matlab software, two network models are created using the existing and enhanced feed-forward neural network techniques. The ability of these models to generalize is measured using simulation methods. The enhanced network model consistently shows a high degree of accuracy and predicts well.
Students whose performance is predicted as outstanding can also be supported financially in the form of scholarships, while those found to be academically weak can be encouraged and rightly counselled at the early stage of their studies.

Item: USING T-WAY INTERACTION TECHNIQUES FOR THE REDUCTION IN THE NUMBER OF TEST CASES (Department of Computer Science, University of Ilorin, 2017)
Authors: Ajiboye, A.R.; Mejabi, O.V.; Salihu, S.A.

A test case is a set of input data designed to discover a particular type of error or defect in a software system. In order to develop software that performs as expected, extensive testing should be carried out to ensure reliability. Ideally, software testers would want to test every possible permutation of the software, but in practice, due to the complexity of the software, exhaustive testing is usually not feasible. This paper presents the use of t-way interaction techniques with a view to reducing the number of test cases in the process of software testing. The software on which the approach is implemented consists of parameters that have the same number of values, and their interaction is based on pairwise combination. The technique minimizes the number of test cases while still testing all pairs of variables. The resulting outputs show a significant reduction in the number of test cases from 8 to 6, a 25% reduction; thus, the overall time required to test the software is optimized. Also, the final reduced test cases are found to be free of redundancy, and the technique used shows a high degree of parameter interaction.
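The pairwise (2-way) idea in the last entry can be illustrated with a minimal greedy sketch. This is an illustrative reconstruction, not the authors' implementation: the parameter names are hypothetical, and a greedy heuristic will not necessarily reproduce the exact 8-to-6 reduction reported in the paper, only a suite smaller than the exhaustive one that still covers every value pair.

```python
from itertools import combinations, product

def pairwise_suite(parameters):
    """Greedy 2-way (pairwise) test-suite reduction: at each step, pick the
    candidate test case that covers the most still-uncovered value pairs.
    `parameters` maps each parameter name to its list of possible values."""
    names = list(parameters)

    def pairs(row):
        # Every (parameter, value) pairing this test case exercises.
        return {((a, row[a]), (b, row[b])) for a, b in combinations(names, 2)}

    # Exhaustive candidate pool: every combination of parameter values.
    candidates = [dict(zip(names, vals))
                  for vals in product(*(parameters[n] for n in names))]
    required = set().union(*(pairs(c) for c in candidates))
    suite = []
    while required:
        best = max(candidates, key=lambda c: len(pairs(c) & required))
        suite.append(best)
        required -= pairs(best)
    return suite

# Three parameters with two values each: exhaustive testing needs
# 2 ** 3 = 8 cases; covering all pairs of values needs fewer.
params = {"os": ["win", "linux"], "db": ["mysql", "pgsql"], "ui": ["web", "cli"]}
suite = pairwise_suite(params)
```

Because every pair of parameter values still appears in at least one selected case, the reduced suite preserves full 2-way interaction coverage while dropping redundant combinations, which is the source of the time saving the paper reports.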