SOFTWARE DEFECT PREDICTION: ANALYSIS OF CLASS IMBALANCE AND PERFORMANCE STABILITY

dc.contributor.author: Balogun, Abdullateef Oluwagbemiga
dc.contributor.author: Basri, Shuib
dc.contributor.author: Said, Jadid Abdulkadir
dc.contributor.author: Adeyemo, Victor Ebenezer
dc.contributor.author: Imam, Abdullahi Abubakar
dc.contributor.author: Bajeh, Amos Orenyi
dc.date.accessioned: 2020-01-30T10:54:47Z
dc.date.available: 2020-01-30T10:54:47Z
dc.date.issued: 2019-12
dc.description.abstract: The performance of prediction models in software defect prediction (SDP) depends on the quality of the datasets used to train them. Class imbalance is one of the data quality problems that affect prediction models; it has drawn the attention of researchers, and many approaches have been developed to address it. This study presents an extensive empirical evaluation of the performance stability of prediction models in SDP. Ten software defect datasets from the NASA and PROMISE repositories with varying imbalance ratio (IR) values were used as the original datasets. New datasets were generated from the original datasets using an undersampling method (Random Under-Sampling: RUS) and an oversampling method (Synthetic Minority Oversampling Technique: SMOTE) at different IR values. The sampling techniques were applied in equal proportions (100%), incrementing the minority class (SMOTE) or decrementing the majority class (RUS) until each dataset was balanced. IR is the ratio of defective instances to non-defective instances in a dataset. Each newly generated dataset, with its IR value determined by the sampling technique applied, was randomized before the prediction models were applied. Nine standard prediction models were used on the newly generated datasets. The performance of the prediction models was measured using the Area Under the Curve (AUC), and the Coefficient of Variation (CV) was used to determine performance stability. Firstly, the experimental results showed that class imbalance had a negative effect on the performance of prediction models and that the oversampling method (SMOTE) enhanced their performance. Secondly, oversampling is a better way of balancing datasets than undersampling, as the latter performed poorly because of the random deletion of useful instances from the datasets. Finally, among the prediction models used in this study, Logistic Regression (LR) (RUS: 30.05; SMOTE: 33.51), Naïve Bayes (NB) (RUS: 34.18; SMOTE: 33.05), and Random Forest (RF) (RUS: 29.24; SMOTE: 64.25), with their respective CV values, were the more stable prediction models and work well with imbalanced datasets.
dc.identifier.issn: 1823-4690
dc.identifier.uri: http://hdl.handle.net/123456789/3595
dc.language.iso: en
dc.publisher: School of Engineering, Taylor’s University
dc.relation.ispartofseries: 14;6
dc.subject: Software defect prediction
dc.subject: Machine learning
dc.subject: Class imbalance
dc.subject: Data quality
dc.title: SOFTWARE DEFECT PREDICTION: ANALYSIS OF CLASS IMBALANCE AND PERFORMANCE STABILITY
dc.type: Article
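
The abstract above describes an IR-based resampling and stability analysis: compute each dataset's imbalance ratio, generate new datasets with SMOTE or RUS at increasing sampling proportions, measure AUC for each, and take the coefficient of variation of the AUC values as the stability indicator. Below is a minimal sketch of that pipeline, assuming the scikit-learn and imbalanced-learn libraries; the feature matrix X, label vector y, and the chosen ratios are hypothetical illustrations, not taken from the paper.

    import numpy as np
    from imblearn.over_sampling import SMOTE
    from imblearn.under_sampling import RandomUnderSampler
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score


    def imbalance_ratio(y):
        # IR: defective (minority, label 1) instances divided by non-defective (label 0) instances.
        return np.sum(y == 1) / np.sum(y == 0)


    def coefficient_of_variation(values):
        # CV (%) = 100 * standard deviation / mean; lower values indicate more stable performance.
        values = np.asarray(values, dtype=float)
        return 100.0 * values.std() / values.mean()


    def auc_across_ratios(X, y, model, sampler_cls, ratios, seed=42):
        # For each target minority/majority ratio, resample, shuffle the generated dataset,
        # and record the cross-validated AUC. Each ratio must exceed the dataset's original
        # IR for SMOTE/RUS, otherwise imbalanced-learn raises an error.
        aucs = []
        for r in ratios:
            X_res, y_res = sampler_cls(sampling_strategy=r, random_state=seed).fit_resample(X, y)
            idx = np.random.default_rng(seed).permutation(len(y_res))
            scores = cross_val_score(model, X_res[idx], y_res[idx], cv=10, scoring="roc_auc")
            aucs.append(scores.mean())
        return aucs


    # Hypothetical usage with a NASA/PROMISE-style feature matrix X and defect labels y:
    # aucs = auc_across_ratios(X, y, LogisticRegression(max_iter=1000), SMOTE, ratios=[0.4, 0.6, 0.8, 1.0])
    # print(np.mean(aucs), coefficient_of_variation(aucs))
    # aucs = auc_across_ratios(X, y, LogisticRegression(max_iter=1000), RandomUnderSampler, ratios=[0.4, 0.6, 0.8, 1.0])

A sampling_strategy of 1.0 corresponds to a fully balanced dataset, the endpoint described in the abstract; intermediate ratios give the series of AUC values over which the CV is computed.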

Files

Original bundle: 14_6_16.pdf (617.06 KB, Adobe Portable Document Format)